1. Gan S, Kim C, Chang J, Lee DY, Park RW. Enhancing readmission prediction models by integrating insights from home healthcare notes: Retrospective cohort study. Int J Nurs Stud 2024;158:104850. PMID: 39024965. DOI: 10.1016/j.ijnurstu.2024.104850.
Abstract
BACKGROUND Hospital readmission is an important indicator of inpatient care quality and a significant driver of increasing medical costs. It is therefore important to explore the effect of postdischarge information, particularly from home healthcare notes, on enhancing readmission prediction models. Despite the use of natural language processing (NLP) and machine learning in prediction model development, current studies often overlook insights from home healthcare notes. OBJECTIVE This study aimed to develop prediction models for 30-day readmission using home healthcare notes and structured data. In addition, it explored the development of 14- and 180-day prediction models using the variables in the 30-day model. DESIGN A retrospective observational cohort study. SETTING(S) This study was conducted at Ajou University School of Medicine in South Korea. PARTICIPANTS Data from electronic health records, encompassing demographic characteristics of 1819 participants along with information on conditions, drugs, and home healthcare, were utilized. METHODS Two distinct models were developed for each prediction window (30, 14, and 180 days): a traditional model, which used structured variables alone, and a common data model (CDM)-NLP model, which incorporated structured variables and topic variables extracted from home healthcare notes. BERTopic was used to generate topics and risk probabilities representing the likelihood of documents being assigned to specific topics. Feature selection involved experimenting with various algorithms, and the best-performing algorithm, determined using the area under the receiver operating characteristic curve (AUROC), was used for model development. Model performance was assessed using various learning metrics, including AUROC. RESULTS Among 1819 patients, 251 (13.80%) experienced 30-day readmission. The least absolute shrinkage and selection operator was used for feature selection and model development. Fifteen structured features were used in the traditional model, and five additional topic variables from the home healthcare notes were included in the CDM-NLP model. The AUROC of the traditional model was 0.739 (95% CI: 0.672-0.807), while the AUROC of the CDM-NLP model was higher at 0.824 (95% CI: 0.768-0.880). The topics in the CDM-NLP model covered emotional distress, daily living functions, nutrition, postoperative status, and cardiorespiratory issues. In the extended models for 14- and 180-day readmission, the CDM-NLP model consistently outperformed the traditional model. CONCLUSIONS This study developed effective prediction models using both structured and unstructured data, emphasizing the value of postdischarge information from home healthcare notes in readmission prediction.
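The model comparisons above rest on AUROC. As a hedged illustration (not the study's code or data), AUROC for a binary outcome can be computed directly from risk scores via the Mann-Whitney formulation: the probability that a randomly chosen readmitted patient receives a higher score than a randomly chosen non-readmitted one.

```python
def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic: the probability that a
    randomly chosen positive is scored above a randomly chosen negative
    (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative only: 4 readmitted (1) and 4 non-readmitted (0) patients
# with hypothetical model risk scores.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
print(auroc(labels, scores))  # → 0.875
```

A score of 0.5 corresponds to chance-level discrimination; the study's 0.739 and 0.824 sit between chance and perfect ranking.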
Affiliation(s)
- Sujin Gan
- Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, Gyeonggi-do, Republic of Korea
- Chungsoo Kim
- Section of Cardiovascular Medicine, Department of Internal Medicine, Yale University School of Medicine, New Haven, CT, USA
- Junhyuck Chang
- Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, Gyeonggi-do, Republic of Korea
- Dong Yun Lee
- Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Gyeonggi-do, Republic of Korea
- Rae Woong Park
- Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, Gyeonggi-do, Republic of Korea; Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Gyeonggi-do, Republic of Korea
2. Clarke E, Chehoud C, Khan N, Spiessens B, Poolman J, Geurtsen J. Unbiased identification of risk factors for invasive Escherichia coli disease using machine learning. BMC Infect Dis 2024;24:796. PMID: 39118021. PMCID: PMC11308465. DOI: 10.1186/s12879-024-09669-3.
Abstract
BACKGROUND Invasive Escherichia coli disease (IED), also known as invasive extraintestinal pathogenic E. coli disease, is a leading cause of sepsis and bacteremia in older adults that can result in hospitalization and sometimes death, and that is frequently associated with antimicrobial resistance. Moreover, certain patient characteristics may increase the risk of developing IED. This study aimed to validate a machine learning approach for the unbiased identification of potential risk factors that correlate with an increased risk for IED. METHODS Using electronic health records from 6.5 million people, an XGBoost model was trained to predict IED from 663 distinct patient features, and the most predictive features were identified as potential risk factors. Using Shapley additive explanations (SHAP) values, the specific relationships between features and the outcome of developing IED were characterized. RESULTS The model independently predicted that older age, a known risk factor for IED, increased the chance of developing IED. The model also predicted that a history of ≥ 1 urinary tract infection, more frequent and/or more recent urinary tract infections, and ≥ 1 emergency department or inpatient visit increased the risk for IED. Outcomes were used to calculate risk ratios in selected subpopulations, demonstrating the impact of individual features, or combinations of features, on the incidence of IED. CONCLUSION This study illustrates the viability and validity of using large electronic health record datasets and machine learning to identify correlating features and potential risk factors for infectious diseases, including IED. The next step is the independent validation of potential risk factors using conventional methods.
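The risk ratios mentioned above compare IED incidence between subpopulations defined by a feature. A minimal sketch with hypothetical counts (not the study's data):

```python
def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk ratio: incidence in the exposed subpopulation divided by
    incidence in the unexposed subpopulation."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed

# Illustrative only: IED cases among patients with vs without a prior
# urinary tract infection, using made-up counts.
rr = risk_ratio(4, 16, 2, 16)
print(rr)  # 2.0
```

A ratio above 1 indicates the feature is associated with higher incidence; the study derives such ratios from model-identified features rather than prespecified hypotheses.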
Affiliation(s)
- Erik Clarke
- Janssen Research and Development Data Sciences, Spring House, PA, USA
- Christel Chehoud
- Janssen Research and Development Data Sciences, Spring House, PA, USA
- Najat Khan
- Janssen Research and Development Data Sciences, Spring House, PA, USA
- Jan Poolman
- Janssen Vaccines and Prevention, Leiden, The Netherlands
3. Seinen TM, Kors JA, van Mulligen EM, Rijnbeek PR. Annotation-preserving machine translation of English corpora to validate Dutch clinical concept extraction tools. J Am Med Inform Assoc 2024;31:1725-1734. PMID: 38934643. PMCID: PMC11258409. DOI: 10.1093/jamia/ocae159.
Abstract
OBJECTIVE To explore the feasibility of validating Dutch concept extraction tools using annotated corpora translated from English, focusing on preserving annotations during translation and addressing the scarcity of non-English annotated clinical corpora. MATERIALS AND METHODS Three annotated corpora were standardized and translated from English to Dutch using 2 machine translation services, Google Translate and OpenAI GPT-4, with annotations preserved through a proposed method of embedding annotations in the text before translation. The performance of 2 concept extraction tools, MedSpaCy and MedCAT, was assessed across the corpora in both Dutch and English. RESULTS The translation process effectively generated Dutch annotated corpora and the concept extraction tools performed similarly in both English and Dutch. Although there were some differences in how annotations were preserved across translations, these did not affect extraction accuracy. Supervised MedCAT models consistently outperformed unsupervised models, whereas MedSpaCy demonstrated high recall but lower precision. DISCUSSION Our validation of Dutch concept extraction tools on corpora translated from English was successful, highlighting the efficacy of our annotation preservation method and the potential for efficiently creating multilingual corpora. Further improvements and comparisons of annotation preservation techniques and strategies for corpus synthesis could lead to more efficient development of multilingual corpora and accurate non-English concept extraction tools. CONCLUSION This study has demonstrated that translated English corpora can be used to validate non-English concept extraction tools. The annotation preservation method used during translation proved effective, and future research can apply this corpus translation method to additional languages and clinical settings.
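The annotation-preservation idea, embedding annotations as inline markers so they ride through machine translation and can be recovered afterwards, can be sketched as follows. This is an illustrative reconstruction under assumed bracket-style markers, not the authors' implementation:

```python
import re

def embed(text, spans):
    """Wrap each annotated span (start, end, label) in bracket markers
    so a translator can carry them through. Spans must be non-overlapping."""
    out, prev = [], 0
    for start, end, label in sorted(spans):
        out.append(text[prev:start])
        out.append(f"[{label}]{text[start:end]}[/{label}]")
        prev = end
    out.append(text[prev:])
    return "".join(out)

def extract(marked):
    """Recover (start, end, label) spans from marked-up text and return
    the clean text alongside them."""
    clean, spans, pos = [], [], 0
    for m in re.finditer(r"\[(\w+)\](.*?)\[/\1\]", marked):
        clean.append(marked[pos:m.start()])
        begin = sum(len(c) for c in clean)
        clean.append(m.group(2))
        spans.append((begin, begin + len(m.group(2)), m.group(1)))
        pos = m.end()
    clean.append(marked[pos:])
    return "".join(clean), spans

marked = embed("Patient has diabetes.", [(12, 20, "COND")])
# A real pipeline would machine-translate `marked` here; the markers
# travel with the translated surface text and are stripped afterwards.
text, spans = extract(marked)
print(text, spans)
```

The practical question the paper addresses is whether such markers survive real translation services intact; the sketch only shows the round trip with no translation step.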
Affiliation(s)
- Tom M Seinen
- Department of Medical Informatics, Erasmus University Medical Center, 3015 GD Rotterdam, The Netherlands
- Jan A Kors
- Department of Medical Informatics, Erasmus University Medical Center, 3015 GD Rotterdam, The Netherlands
- Erik M van Mulligen
- Department of Medical Informatics, Erasmus University Medical Center, 3015 GD Rotterdam, The Netherlands
- Peter R Rijnbeek
- Department of Medical Informatics, Erasmus University Medical Center, 3015 GD Rotterdam, The Netherlands
4. John LH, Fridgeirsson EA, Kors JA, Reps JM, Williams RD, Ryan PB, Rijnbeek PR. Development and validation of a patient-level model to predict dementia across a network of observational databases. BMC Med 2024;22:308. PMID: 39075527. PMCID: PMC11288076. DOI: 10.1186/s12916-024-03530-9.
Abstract
BACKGROUND A prediction model can be a useful tool to quantify a patient's risk of developing dementia in the coming years and to target risk-factor interventions. Numerous dementia prediction models have been developed, but few have been externally validated, likely limiting their clinical uptake. In our previous work, we had limited success in externally validating some of these existing models due to inadequate reporting. This motivated us to develop and externally validate novel models to predict dementia in the general population across a network of observational databases. We assessed regularization methods to obtain parsimonious models of lower complexity that are easier to implement. METHODS Logistic regression models were developed across a network of five observational databases with electronic health records (EHRs) and claims data to predict 5-year dementia risk in persons aged 55-84. We assessed two regularization methods, L1 and broken adaptive ridge (BAR), together with three candidate predictor sets to optimize prediction performance. The predictor sets were a baseline set using only age and sex, a full set including all available candidate predictors, and a phenotype set comprising a limited number of clinically relevant predictors. RESULTS BAR can be used for variable selection, outperforming L1 when a parsimonious model is desired. Adding candidate predictors for disease diagnosis and drug exposure generally improves the performance of baseline models using only age and sex. While a model trained on German EHR data saw an increase in AUROC from 0.74 to 0.83 with additional predictors, a model trained on US EHR data showed only a minimal improvement, from 0.79 to 0.81 AUROC. Nevertheless, the latter model, developed using BAR regularization on the clinically relevant predictor set, was ultimately chosen as the best-performing model because it demonstrated more consistent external validation performance and improved calibration.
CONCLUSIONS We developed and externally validated patient-level models to predict dementia. Our results show that although dementia prediction is strongly driven by age, adding predictors based on condition diagnoses and drug exposures further improves prediction performance. BAR regularization outperforms L1 regularization in yielding the most parsimonious yet still well-performing prediction model for dementia.
Affiliation(s)
- Luis H John
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Egill A Fridgeirsson
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Jan A Kors
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Jenna M Reps
- Janssen Research and Development, Raritan, NJ, USA
- Ross D Williams
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Peter R Rijnbeek
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
5. Guo LL, Fries J, Steinberg E, Fleming SL, Morse K, Aftandilian C, Posada J, Shah N, Sung L. A multi-center study on the adaptability of a shared foundation model for electronic health records. NPJ Digit Med 2024;7:171. PMID: 38937550. PMCID: PMC11211479. DOI: 10.1038/s41746-024-01166-w.
Abstract
Foundation models are transforming artificial intelligence (AI) in healthcare by providing modular components adaptable to various downstream tasks, making AI development more scalable and cost-effective. Foundation models for structured electronic health records (EHR), trained on coded medical records from millions of patients, have demonstrated benefits including increased performance with fewer training labels and improved robustness to distribution shifts. However, questions remain about the feasibility of sharing these models across hospitals and their performance on local tasks. This multi-center study examined the adaptability of a publicly accessible structured EHR foundation model (FMSM), trained on 2.57 million patient records from Stanford Medicine. Experiments used EHR data from The Hospital for Sick Children (SickKids) and the Medical Information Mart for Intensive Care (MIMIC-IV). We assessed adaptability via continued pretraining on local data, and task adaptability against baselines of models trained locally from scratch, including a local foundation model. Evaluations on 8 clinical prediction tasks showed that adapting the off-the-shelf FMSM matched the performance of gradient boosting machines (GBM) locally trained on all data while providing a 13% improvement in settings with few task-specific training labels. With continued pretraining on local data, FMSM required fewer than 1% of training examples to match the fully trained GBM's performance, and was 60 to 90% more sample-efficient than training local foundation models from scratch. Our findings demonstrate that adapting EHR foundation models across hospitals provides improved prediction performance at lower cost, underscoring the utility of base foundation models as modular components that streamline the development of healthcare AI.
Affiliation(s)
- Lin Lawrence Guo
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON, Canada
- Jason Fries
- Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA, USA
- Ethan Steinberg
- Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA, USA
- Scott Lanyon Fleming
- Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA, USA
- Keith Morse
- Division of Pediatric Hospital Medicine, Department of Pediatrics, Stanford University, Palo Alto, CA, USA
- Catherine Aftandilian
- Division of Hematology/Oncology, Department of Pediatrics, Stanford University, Palo Alto, CA, USA
- Jose Posada
- Universidad del Norte, Barranquilla, Colombia
- Nigam Shah
- Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA, USA
- Lillian Sung
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON, Canada
- Division of Haematology/Oncology, The Hospital for Sick Children, Toronto, ON, Canada
6. Fridgeirsson EA, Williams R, Rijnbeek P, Suchard MA, Reps JM. Comparing penalization methods for linear models on large observational health data. J Am Med Inform Assoc 2024;31:1514-1521. PMID: 38767857. PMCID: PMC11187433. DOI: 10.1093/jamia/ocae109.
Abstract
OBJECTIVE This study evaluates regularization variants in logistic regression (L1, L2, ElasticNet, Adaptive L1, Adaptive ElasticNet, Broken adaptive ridge [BAR], and Iterative hard thresholding [IHT]) for discrimination and calibration performance, focusing on both internal and external validation. MATERIALS AND METHODS We use data from 5 US claims and electronic health record databases and develop models for various outcomes in a major depressive disorder patient population. We externally validate all models in the other databases. We use a train-test split of 75%/25% and evaluate performance with discrimination and calibration. Statistical analysis for difference in performance uses Friedman's test and critical difference diagrams. RESULTS Of the 840 models we develop, L1 and ElasticNet emerge as superior in both internal and external discrimination, with a notable AUC difference. BAR and IHT show the best internal calibration, without a clear external calibration leader. ElasticNet typically has larger model sizes than L1. Methods like IHT and BAR, while slightly less discriminative, significantly reduce model complexity. CONCLUSION L1 and ElasticNet offer the best discriminative performance in logistic regression for healthcare predictions, maintaining robustness across validations. For simpler, more interpretable models, L0-based methods (IHT and BAR) are advantageous, providing greater parsimony and calibration with fewer features. This study aids in selecting suitable regularization techniques for healthcare prediction models, balancing performance, complexity, and interpretability.
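The penalties compared here differ in how they shrink or prune coefficients. A hedged stdlib sketch of the two extremes, the L1 soft-threshold (which shrinks every weight and zeroes small ones) versus the hard threshold used by IHT (which zeroes small weights and leaves the survivors unshrunk):

```python
def soft_threshold(w, lam):
    """Proximal operator of the L1 penalty: shrink every weight toward
    zero by lam, zeroing those within lam of zero."""
    return [x - lam if x > lam else x + lam if x < -lam else 0.0 for x in w]

def hard_threshold(w, lam):
    """Iterative-hard-thresholding step: zero out weights at or below
    lam in magnitude and leave the rest untouched."""
    return [x if abs(x) > lam else 0.0 for x in w]

w = [2.0, 0.3, -1.5, -0.1]
print(soft_threshold(w, 0.5))  # [1.5, 0.0, -1.0, 0.0]
print(hard_threshold(w, 0.5))  # [2.0, 0.0, -1.5, 0.0]
```

Both operators produce sparse solutions, but only the soft threshold biases the retained coefficients toward zero, which is one intuition for why L0-style methods such as IHT and BAR can calibrate better at the same sparsity.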
Affiliation(s)
- Egill A Fridgeirsson
- Department of Medical Informatics, Erasmus University Medical Center, 3015 GD Rotterdam, The Netherlands
- Ross Williams
- Department of Medical Informatics, Erasmus University Medical Center, 3015 GD Rotterdam, The Netherlands
- Peter Rijnbeek
- Department of Medical Informatics, Erasmus University Medical Center, 3015 GD Rotterdam, The Netherlands
- Marc A Suchard
- Department of Biostatistics, University of California, Los Angeles, Los Angeles, CA 90095-1772, United States
- VA Informatics and Computing Infrastructure, United States Department of Veterans Affairs, Salt Lake City, UT 84148, United States
- Jenna M Reps
- Department of Medical Informatics, Erasmus University Medical Center, 3015 GD Rotterdam, The Netherlands
- Observational Health Data Analytics, Janssen Research and Development, Titusville, NJ 08560, United States
7. Khan SD, Hoodbhoy Z, Raja MHR, Kim JY, Hogg HDJ, Manji AAA, Gulamali F, Hasan A, Shaikh A, Tajuddin S, Khan NS, Patel MR, Balu S, Samad Z, Sendak MP. Frameworks for procurement, integration, monitoring, and evaluation of artificial intelligence tools in clinical settings: a systematic review. PLOS Digit Health 2024;3:e0000514. PMID: 38809946. PMCID: PMC11135672. DOI: 10.1371/journal.pdig.0000514.
Abstract
Research on the applications of artificial intelligence (AI) tools in medicine has increased exponentially over the last few years, but implementation in clinical practice has not seen a commensurate increase, with a lack of consensus on how to implement and maintain such tools. This systematic review aims to summarize frameworks for procuring, implementing, monitoring, and evaluating AI tools in clinical practice. A comprehensive literature search following PRISMA guidelines was performed on the MEDLINE, Wiley Cochrane, Scopus, and EBSCO databases to identify articles recommending practices, frameworks, or guidelines for AI procurement, integration, monitoring, and evaluation. From the included articles, data were extracted on study aim, use of a framework, rationale for the framework, and details of AI implementation covering procurement, integration, monitoring, and evaluation. The extracted details were then mapped onto the domains of the Donabedian Plan, Do, Study, Act cycle. The search yielded 17,537 unique articles, of which 47 were evaluated for inclusion based on their full texts and 25 were included in the review. Common themes included transparency, feasibility of operation within and integration into existing workflows, validation of the tool using predefined performance indicators, and improving the algorithm and/or adjusting the tool to improve performance. Among the four domains (Plan, Do, Study, Act), the most common was Plan (84%, n = 21), followed by Study (60%, n = 15), Do (52%, n = 13), and Act (24%, n = 6). Among 172 authors, only 1 (0.6%) was from a low-income country (LIC) and 2 (1.2%) were from lower-middle-income countries (LMICs). Healthcare professionals cite the implementation of AI tools within clinical settings as challenging owing to low levels of evidence in the Do and Act domains. The current healthcare AI landscape calls for increased data sharing and knowledge translation to facilitate common goals and reap maximum clinical benefit.
Affiliation(s)
- Sarim Dawar Khan
- CITRIC Health Data Science Centre, Department of Medicine, Aga Khan University, Karachi, Pakistan
- Zahra Hoodbhoy
- CITRIC Health Data Science Centre, Department of Medicine, Aga Khan University, Karachi, Pakistan
- Department of Paediatrics and Child Health, Aga Khan University, Karachi, Pakistan
- Jee Young Kim
- Duke Institute for Health Innovation, Duke University School of Medicine, Durham, North Carolina, United States
- Henry David Jeffry Hogg
- Population Health Science Institute, Newcastle University, Newcastle upon Tyne, United Kingdom
- Newcastle upon Tyne Hospitals NHS Foundation Trust, Newcastle upon Tyne, United Kingdom
- Moorfields Eye Hospital NHS Foundation Trust, London, United Kingdom
- Afshan Anwar Ali Manji
- CITRIC Health Data Science Centre, Department of Medicine, Aga Khan University, Karachi, Pakistan
- Freya Gulamali
- Duke Institute for Health Innovation, Duke University School of Medicine, Durham, North Carolina, United States
- Alifia Hasan
- Duke Institute for Health Innovation, Duke University School of Medicine, Durham, North Carolina, United States
- Asim Shaikh
- CITRIC Health Data Science Centre, Department of Medicine, Aga Khan University, Karachi, Pakistan
- Salma Tajuddin
- CITRIC Health Data Science Centre, Department of Medicine, Aga Khan University, Karachi, Pakistan
- Nida Saddaf Khan
- CITRIC Health Data Science Centre, Department of Medicine, Aga Khan University, Karachi, Pakistan
- Manesh R. Patel
- Duke Clinical Research Institute, Duke University School of Medicine, Durham, North Carolina, United States
- Division of Cardiology, Duke University School of Medicine, Durham, North Carolina, United States
- Suresh Balu
- Duke Institute for Health Innovation, Duke University School of Medicine, Durham, North Carolina, United States
- Zainab Samad
- CITRIC Health Data Science Centre, Department of Medicine, Aga Khan University, Karachi, Pakistan
- Department of Medicine, Aga Khan University, Karachi, Pakistan
- Mark P. Sendak
- Duke Institute for Health Innovation, Duke University School of Medicine, Durham, North Carolina, United States
8. Naderalvojoud B, Curtin CM, Yanover C, El-Hay T, Choi B, Park RW, Tabuenca JG, Reeve MP, Falconer T, Humphreys K, Asch SM, Hernandez-Boussard T. Towards global model generalizability: independent cross-site feature evaluation for patient-level risk prediction models using the OHDSI network. J Am Med Inform Assoc 2024;31:1051-1061. PMID: 38412331. PMCID: PMC11031239. DOI: 10.1093/jamia/ocae028.
Abstract
BACKGROUND Predictive models show promise in healthcare, but their successful deployment is challenging due to limited generalizability. Current external validation often focuses on model performance with restricted feature use from the original training data, lacking insight into a model's suitability at external sites. Our study introduces a methodology for evaluating features during both development and validation, focusing on creating and validating predictive models for post-surgery patient outcomes with improved generalizability. METHODS Electronic health records (EHRs) from 4 countries (United States, United Kingdom, Finland, and Korea), covering 2008-2019, were mapped to the OMOP Common Data Model (CDM). Machine learning (ML) models were developed to predict the risk of post-surgery prolonged opioid use (POU) using data collected in the 6 months before surgery. Both local and cross-site feature selection methods were applied to the development and external validation datasets. Models were developed using Observational Health Data Sciences and Informatics (OHDSI) tools and validated on separate patient cohorts. RESULTS Model development included 41 929 patients, 14.6% with POU. The external validation included 31 932 (UK), 23 100 (US), 7295 (Korea), and 3934 (Finland) patients, with POU rates of 44.2%, 22.0%, 15.8%, and 21.8%, respectively. The top-performing model, lasso logistic regression, achieved an area under the receiver operating characteristic curve (AUROC) of 0.75 during local validation and an average of 0.69 (SD = 0.02) in external validation. Models trained with cross-site feature selection significantly outperformed those using only features from the development site in external validation (P < .05). CONCLUSIONS Using EHRs from four countries mapped to the OMOP CDM, we developed generalizable predictive models for POU. Our approach demonstrates the significant impact of cross-site feature selection on model performance, underscoring the importance of incorporating diverse feature sets from various clinical settings to enhance the generalizability and utility of predictive healthcare models.
Affiliation(s)
- Catherine M Curtin
- Department of Surgery, Veterans Affairs Palo Alto Health Care System, Palo Alto, CA 94304, United States
- Chen Yanover
- KI Research Institute, Kfar Malal, 4592000, Israel
- Tal El-Hay
- KI Research Institute, Kfar Malal, 4592000, Israel
- Byungjin Choi
- Department of Biomedical Informatics, Ajou University Graduate School of Medicine, Suwon, 16499, Korea
- Rae Woong Park
- Department of Biomedical Informatics, Ajou University Graduate School of Medicine, Suwon, 16499, Korea
- Javier Gracia Tabuenca
- Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, 00014, Finland
- Mary Pat Reeve
- Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, 00014, Finland
- Thomas Falconer
- Department of Biomedical Informatics, Columbia University, New York, NY 10032, United States
- Keith Humphreys
- Department of Psychiatry and the Behavioral Sciences, Stanford University, Stanford, CA 94305, United States
- Center for Innovation to Implementation, Veterans Affairs Palo Alto Health Care System, Palo Alto, CA 94304, United States
- Steven M Asch
- Department of Medicine, Stanford University, Stanford, CA 94305, United States
- Center for Innovation to Implementation, Veterans Affairs Palo Alto Health Care System, Palo Alto, CA 94304, United States
9. Guo LL, Morse KE, Aftandilian C, Steinberg E, Fries J, Posada J, Fleming SL, Lemmon J, Jessa K, Shah N, Sung L. Characterizing the limitations of using diagnosis codes in the context of machine learning for healthcare. BMC Med Inform Decis Mak 2024;24:51. PMID: 38355486. PMCID: PMC10868117. DOI: 10.1186/s12911-024-02449-8.
Abstract
BACKGROUND Diagnostic codes are commonly used as inputs for clinical prediction models, to create labels for prediction tasks, and to identify cohorts for multicenter network studies. However, the coverage rates of diagnostic codes and their variability across institutions are underexplored. The primary objective was to describe lab- and diagnosis-based labels for 7 selected outcomes at three institutions. Secondary objectives were to describe agreement, sensitivity, and specificity of diagnosis-based labels against lab-based labels. METHODS This study included three cohorts: SickKids from The Hospital for Sick Children, and StanfordPeds and StanfordAdults from Stanford Medicine. We included seven clinical outcomes with lab-based definitions: acute kidney injury, hyperkalemia, hypoglycemia, hyponatremia, anemia, neutropenia and thrombocytopenia. For each outcome, we created four lab-based labels (abnormal, mild, moderate and severe) based on test result and one diagnosis-based label. Proportion of admissions with a positive label were presented for each outcome stratified by cohort. Using lab-based labels as the gold standard, agreement using Cohen's Kappa, sensitivity and specificity were calculated for each lab-based severity level. RESULTS The number of admissions included were: SickKids (n = 59,298), StanfordPeds (n = 24,639) and StanfordAdults (n = 159,985). The proportion of admissions with a positive diagnosis-based label was significantly higher for StanfordPeds compared to SickKids across all outcomes, with odds ratio (99.9% confidence interval) for abnormal diagnosis-based label ranging from 2.2 (1.7-2.7) for neutropenia to 18.4 (10.1-33.4) for hyperkalemia. Lab-based labels were more similar by institution. When using lab-based labels as the gold standard, Cohen's Kappa and sensitivity were lower at SickKids for all severity levels compared to StanfordPeds. 
CONCLUSIONS Across multiple outcomes, diagnosis codes were consistently different between the two pediatric institutions. This difference was not explained by differences in test results. These results may have implications for machine learning model development and deployment.
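As a reading aid only (not the authors' code), the agreement metrics this study reports can be sketched in Python for binary per-admission labels; the label vectors below are hypothetical toy data:

```python
def confusion(gold, pred):
    """Cross-tabulate binary labels: true/false positives and negatives."""
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    tn = sum(1 for g, p in zip(gold, pred) if not g and not p)
    return tp, fp, fn, tn

def sensitivity(gold, pred):
    tp, _, fn, _ = confusion(gold, pred)
    return tp / (tp + fn)

def specificity(gold, pred):
    _, fp, _, tn = confusion(gold, pred)
    return tn / (tn + fp)

def cohens_kappa(gold, pred):
    """Agreement beyond chance between two binary label sources."""
    tp, fp, fn, tn = confusion(gold, pred)
    n = tp + fp + fn + tn
    p_observed = (tp + tn) / n
    p_chance = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical admissions: lab-based label as gold standard,
# diagnosis-code label as the rater under evaluation.
lab_label = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
dx_label  = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

print(sensitivity(lab_label, dx_label))          # 0.5: half of lab positives coded
print(round(specificity(lab_label, dx_label), 2))
print(round(cohens_kappa(lab_label, dx_label), 2))
```

Low sensitivity with high specificity, as in this toy run, matches the paper's pattern of diagnosis codes under-capturing lab-defined events.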
Affiliation(s)
- Lin Lawrence Guo: Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON, Canada
- Keith E Morse: Division of Pediatric Hospital Medicine, Department of Pediatrics, Stanford University, Palo Alto, CA, USA
- Catherine Aftandilian: Division of Hematology/Oncology, Department of Pediatrics, Stanford University, Palo Alto, CA, USA
- Ethan Steinberg: Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA, USA
- Jason Fries: Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA, USA
- Jose Posada: Universidad del Norte, Barranquilla, Colombia
- Scott Lanyon Fleming: Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA, USA
- Joshua Lemmon: Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON, Canada
- Karim Jessa: Information Services, The Hospital for Sick Children, Toronto, ON, Canada
- Nigam Shah: Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA, USA
- Lillian Sung: Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON, Canada; Division of Haematology/Oncology, The Hospital for Sick Children, 555 University Avenue, M5G1X8, Toronto, ON, Canada
10. Ahmadi N, Nguyen QV, Sedlmayr M, Wolfien M. A comparative patient-level prediction study in OMOP CDM: applicative potential and insights from synthetic data. Sci Rep 2024; 14:2287. PMID: 38280887; PMCID: PMC10821926; DOI: 10.1038/s41598-024-52723-y.
Abstract
The emergence of collaborations that standardize and combine multiple clinical databases across different regions provides a rich source of data that is fundamental for clinical prediction models, such as patient-level predictions. With the aid of such large data pools, researchers are able to develop clinical prediction models for improved disease classification, risk assessment, and beyond. To fully utilize this potential, Machine Learning (ML) methods are commonly required to process these large amounts of data on disease-specific patient cohorts. As a consequence, the Observational Health Data Sciences and Informatics (OHDSI) collaborative develops a framework to facilitate the application of ML models for these standardized patient datasets by using the Observational Medical Outcomes Partnership (OMOP) common data model (CDM). In this study, we compare the feasibility of current web-based OHDSI approaches, namely ATLAS and "Patient-level Prediction" (PLP), against a native solution (R-based) to conduct such ML-based patient-level prediction analyses in OMOP. This will enable potential users to select the most suitable approach for their investigation. Each of the applied ML solutions was individually utilized to solve the same patient-level prediction task. Both approaches went through an exemplary benchmarking analysis to assess the weaknesses and strengths of the PLP R package. The performance of this package was subsequently compared with the commonly used native R package Machine Learning in R 3 (mlr3) and its sub-packages. The approaches were evaluated on performance, execution time, and ease of model implementation. The results show that the PLP package has shorter execution times, which indicates great scalability, as well as intuitive code implementation and numerous possibilities for visualization.
However, limitations in comparison to native packages were observed in the implementation of specific ML classifiers (e.g., Lasso), which may result in decreased performance for real-world prediction problems. The findings here contribute to the overall effort of developing ML-based prediction models on a clinical scale and provide a snapshot for future studies that explicitly aim to develop patient-level prediction models in OMOP CDM.
Affiliation(s)
- Najia Ahmadi: Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, 01307, Dresden, Germany
- Quang Vu Nguyen: Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, 01307, Dresden, Germany
- Martin Sedlmayr: Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, 01307, Dresden, Germany
- Markus Wolfien: Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, 01307, Dresden, Germany; Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI), Dresden, Germany
11. Schuemie M, Reps J, Black A, DeFalco F, Evans L, Fridgeirsson E, Gilbert JP, Knoll C, Lavallee M, Rao GA, Rijnbeek P, Sadowski K, Sena A, Swerdel J, Williams RD, Suchard M. Health-Analytics Data to Evidence Suite (HADES): Open-Source Software for Observational Research. Stud Health Technol Inform 2024; 310:966-970. PMID: 38269952; PMCID: PMC10868467; DOI: 10.3233/shti231108.
Abstract
The Health-Analytics Data to Evidence Suite (HADES) is an open-source software collection developed by Observational Health Data Sciences and Informatics (OHDSI). It executes directly against healthcare data, such as electronic health records and administrative claims, that have been converted to the Observational Medical Outcomes Partnership (OMOP) Common Data Model. Using advanced analytics, HADES performs characterization, population-level causal effect estimation, and patient-level prediction, potentially across a federated data network, allowing patient-level data to remain local while only aggregated statistics are shared. Designed to run across a wide array of technical environments, including different operating systems and database platforms, HADES uses continuous integration with a large set of unit tests to maintain reliability. HADES implements OHDSI best practices and is used in almost all published OHDSI studies, including some that have directly informed regulatory decisions.
Affiliation(s)
- Martijn Schuemie: Observational Health Data Science and Informatics, New York, NY, USA; Observational Health Data Analytics, Johnson & Johnson, Titusville, NJ, USA; Department of Biostatistics, UCLA, Los Angeles, CA, USA
- Jenna Reps: Observational Health Data Science and Informatics, New York, NY, USA; Observational Health Data Analytics, Johnson & Johnson, Titusville, NJ, USA; Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Adam Black: Observational Health Data Science and Informatics, New York, NY, USA; Odysseus Data Services Inc., Cambridge, MA, USA
- Frank DeFalco: Observational Health Data Science and Informatics, New York, NY, USA; Observational Health Data Analytics, Johnson & Johnson, Titusville, NJ, USA
- Lee Evans: Observational Health Data Science and Informatics, New York, NY, USA; LTS Computing LLC, West Chester, PA, USA
- Egill Fridgeirsson: Observational Health Data Science and Informatics, New York, NY, USA; Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- James P. Gilbert: Observational Health Data Science and Informatics, New York, NY, USA; Observational Health Data Analytics, Johnson & Johnson, Titusville, NJ, USA
- Chris Knoll: Observational Health Data Science and Informatics, New York, NY, USA; Observational Health Data Analytics, Johnson & Johnson, Titusville, NJ, USA
- Martin Lavallee: Observational Health Data Science and Informatics, New York, NY, USA; Virginia Commonwealth University, Richmond, VA, USA
- Gowtham A. Rao: Observational Health Data Science and Informatics, New York, NY, USA; Observational Health Data Analytics, Johnson & Johnson, Titusville, NJ, USA
- Peter Rijnbeek: Observational Health Data Science and Informatics, New York, NY, USA; Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Katy Sadowski: Observational Health Data Science and Informatics, New York, NY, USA; TrialSpark Inc., New York, NY, USA
- Anthony Sena: Observational Health Data Science and Informatics, New York, NY, USA; Observational Health Data Analytics, Johnson & Johnson, Titusville, NJ, USA; Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Joel Swerdel: Observational Health Data Science and Informatics, New York, NY, USA; Observational Health Data Analytics, Johnson & Johnson, Titusville, NJ, USA
- Ross D. Williams: Observational Health Data Science and Informatics, New York, NY, USA; Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Marc Suchard: Observational Health Data Science and Informatics, New York, NY, USA; Department of Biostatistics, UCLA, Los Angeles, CA, USA; VA Informatics and Computing Infrastructure, Department of Veterans Affairs, Salt Lake City, UT, USA
12. Naderalvojoud B, Hernandez-Boussard T. Improving machine learning with ensemble learning on observational healthcare data. AMIA Annu Symp Proc 2024; 2023:521-529. PMID: 38222353; PMCID: PMC10785929.
Abstract
Ensemble learning is a powerful technique for improving the accuracy and reliability of prediction models, especially in scenarios where individual models may not perform well. However, combining models with varying accuracies may not always improve the final prediction results, as models with lower accuracies may obscure the results of models with higher accuracies. This paper addresses this issue and answers the question of when an ensemble approach outperforms individual models for prediction. As a result, we propose an ensemble model for predicting patients at risk of postoperative prolonged opioid use. The model incorporates two machine learning models that are trained using different covariates, resulting in high precision and recall. Our study, which employs five different machine learning algorithms, shows that the proposed approach significantly improves the final prediction results in terms of AUROC and AUPRC.
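Not the authors' code; a minimal sketch, assuming per-patient risk probabilities from two models trained on different covariate sets, of the situation the abstract describes, where soft-vote averaging beats either model alone. The outcome labels and probabilities are toy values chosen to illustrate:

```python
def soft_vote(p_a, p_b, w_a=0.5):
    """Weighted average of per-patient risk probabilities from two models."""
    return [w_a * a + (1 - w_a) * b for a, b in zip(p_a, p_b)]

def auroc(y_true, p):
    """Probability that a random positive outranks a random negative."""
    pos = [p_i for y, p_i in zip(y_true, p) if y]
    neg = [p_i for y, p_i in zip(y_true, p) if not y]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

y   = [1, 0, 1, 0]           # hypothetical outcomes
p_a = [0.9, 0.5, 0.3, 0.4]   # model A misranks the second positive
p_b = [0.4, 0.5, 0.9, 0.3]   # model B misranks the first positive
print(auroc(y, p_a), auroc(y, p_b))    # 0.5 0.75
print(auroc(y, soft_vote(p_a, p_b)))   # 1.0: complementary errors cancel
```

The ensemble only helps here because the two models err on different patients; with strongly correlated errors, averaging would not improve ranking, which is the paper's point about when ensembling pays off.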
Affiliation(s)
- Behzad Naderalvojoud: Department of Medicine, Biomedical Informatics, Stanford University, Stanford, CA, USA
13. Fridgeirsson EA, Sontag D, Rijnbeek P. Attention-based neural networks for clinical prediction modelling on electronic health records. BMC Med Res Methodol 2023; 23:285. PMID: 38062352; PMCID: PMC10701944; DOI: 10.1186/s12874-023-02112-2.
Abstract
BACKGROUND Deep learning models have had a lot of success in various fields. However, they have struggled on structured data. Here we apply four state-of-the-art supervised deep learning models using the attention mechanism and compare them against logistic regression and XGBoost using discrimination, calibration, and clinical utility. METHODS We develop the models using a general practitioners database. We implement a recurrent neural network, a transformer with and without reverse distillation, and a graph neural network. We measure discrimination using the area under the receiver operating characteristic curve (AUC) and the area under the precision-recall curve (AUPRC). We assess smooth calibration using restricted cubic splines and clinical utility with decision curve analysis. RESULTS Our results show that deep learning approaches can improve discrimination by up to 2.5 percentage points in AUC and 7.4 percentage points in AUPRC. However, on average the baselines are competitive. Most models are calibrated similarly to the baselines, except for the graph neural network. The transformer using reverse distillation shows the best performance in clinical utility on two out of three prediction problems over most of the prediction thresholds. CONCLUSION In this study, we evaluated various approaches to supervised learning using neural networks and attention. We performed a rigorous comparison, looking not only at discrimination but also at calibration and clinical utility. There is value in using deep learning models on electronic health record data, since they can improve discrimination and clinical utility while providing good calibration. However, good baseline methods are still competitive.
Affiliation(s)
- Egill A Fridgeirsson: Department of Medical Informatics, Erasmus University Medical Center, Doctor Molewaterplein 40, 3015 GD, Rotterdam, the Netherlands
- David Sontag: Institute for Medical Engineering & Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Peter Rijnbeek: Department of Medical Informatics, Erasmus University Medical Center, Doctor Molewaterplein 40, 3015 GD, Rotterdam, the Netherlands
14. Seinen TM, Kors JA, van Mulligen EM, Fridgeirsson E, Rijnbeek PR. The added value of text from Dutch general practitioner notes in predictive modeling. J Am Med Inform Assoc 2023; 30:1973-1984. PMID: 37587084; PMCID: PMC10654855; DOI: 10.1093/jamia/ocad160.
Abstract
OBJECTIVE This work aims to explore the value of Dutch unstructured data, in combination with structured data, for the development of prognostic prediction models in a general practitioner (GP) setting. MATERIALS AND METHODS We trained and validated prediction models for 4 common clinical prediction problems using various sparse text representations, common prediction algorithms, and observational GP electronic health record (EHR) data. We trained and validated 84 models internally and externally on data from different EHR systems. RESULTS On average, over all the different text representations and prediction algorithms, models only using text data performed better or similar to models using structured data alone in 2 prediction tasks. Additionally, in these 2 tasks, the combination of structured and text data outperformed models using structured or text data alone. No large performance differences were found between the different text representations and prediction algorithms. DISCUSSION Our findings indicate that the use of unstructured data alone can result in well-performing prediction models for some clinical prediction problems. Furthermore, the performance improvement achieved by combining structured and text data highlights the added value. Additionally, we demonstrate the significance of clinical natural language processing research in languages other than English and the possibility of validating text-based prediction models across various EHR systems. CONCLUSION Our study highlights the potential benefits of incorporating unstructured data in clinical prediction models in a GP setting. Although the added value of unstructured data may vary depending on the specific prediction task, our findings suggest that it has the potential to enhance patient care.
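As an illustration only, and not this study's pipeline: one common way to combine the two modalities is to concatenate structured covariates with a sparse hashed bag-of-words vector built from the note text. All names, tokens, and values below are hypothetical:

```python
import hashlib

def hash_text(note, n_buckets=8):
    """Feature hashing: map each token to one of n_buckets count features."""
    vec = [0] * n_buckets
    for token in note.lower().split():
        # Stable hash (unlike built-in hash(), which is salted per process)
        digest = hashlib.md5(token.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % n_buckets] += 1
    return vec

structured = [74, 1, 3]  # e.g. age, sex, prior GP contacts (hypothetical)
text_vec = hash_text("shortness of breath worsening at night")
features = structured + text_vec  # one combined feature vector per patient
print(len(features), sum(text_vec))  # 11 6
```

Any linear or tree-based learner can then be fit on the combined vector, which mirrors the abstract's finding that structured-plus-text features can outperform either modality alone.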
Affiliation(s)
- Tom M Seinen: Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Jan A Kors: Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Erik M van Mulligen: Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Egill Fridgeirsson: Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Peter R Rijnbeek: Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
15. Choi JY, Yoo S, Song W, Kim S, Baek H, Lee JS, Yoon YS, Yoon S, Lee HY, Kim KI. Development and Validation of a Prognostic Classification Model Predicting Postoperative Adverse Outcomes in Older Surgical Patients Using a Machine Learning Algorithm: Retrospective Observational Network Study. J Med Internet Res 2023; 25:e42259. PMID: 37955965; PMCID: PMC10682929; DOI: 10.2196/42259.
Abstract
BACKGROUND Older adults are at an increased risk of postoperative morbidity. Numerous risk stratification tools exist, but they require effort and manpower. OBJECTIVE This study aimed to develop a predictive model of postoperative adverse outcomes in older patients following general surgery, using the open-source patient-level prediction framework from Observational Health Data Sciences and Informatics, with internal and external validation. METHODS We used the Observational Medical Outcomes Partnership common data model and machine learning algorithms. The primary outcome was a composite of 90-day postoperative all-cause mortality and emergency department visits. Secondary outcomes were postoperative delirium, prolonged postoperative stay (≥75th percentile), and prolonged hospital stay (≥21 days). An 80%/20% split of the Seoul National University Bundang Hospital (SNUBH) common data model was used for model training and testing, while the Seoul National University Hospital (SNUH) common data model was used for external validation. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) with a 95% CI. RESULTS Data from 27,197 (SNUBH) and 32,857 (SNUH) patients were analyzed. Compared to the random forest, Adaboost, and decision tree models, the least absolute shrinkage and selection operator logistic regression model showed good internal discriminative accuracy (internal AUC 0.723, 95% CI 0.701-0.744) and transportability (external AUC 0.703, 95% CI 0.692-0.714) for the primary outcome. The model also possessed good internal and external AUCs for postoperative delirium (internal AUC 0.754, 95% CI 0.713-0.794; external AUC 0.750, 95% CI 0.727-0.772), prolonged postoperative stay (internal AUC 0.813, 95% CI 0.800-0.825; external AUC 0.747, 95% CI 0.741-0.753), and prolonged hospital stay (internal AUC 0.770, 95% CI 0.749-0.792; external AUC 0.707, 95% CI 0.696-0.718).
Compared with age or the Charlson comorbidity index, the model showed better prediction performance. CONCLUSIONS The derived model can assist clinicians and patients in understanding the individualized risks and benefits of surgery.
Affiliation(s)
- Jung-Yeon Choi: Department of Internal Medicine, Seoul National University Bundang Hospital, Seongnam-si, Republic of Korea
- Sooyoung Yoo: Office of eHealth Research and Business, Seoul National University Bundang Hospital, Seongnam-si, Republic of Korea
- Wongeun Song: Office of eHealth Research and Business, Seoul National University Bundang Hospital, Seongnam-si, Republic of Korea; Department of Health Science and Technology, Graduate School of Convergence Science and Technology, Seoul National University, Seongnam-si, Republic of Korea
- Seok Kim: Office of eHealth Research and Business, Seoul National University Bundang Hospital, Seongnam-si, Republic of Korea
- Hyunyoung Baek: Office of eHealth Research and Business, Seoul National University Bundang Hospital, Seongnam-si, Republic of Korea
- Jun Suh Lee: Department of Surgery, G Sam Hospital, Gunpo, Republic of Korea
- Yoo-Seok Yoon: Department of Surgery, Seoul National University Bundang Hospital, Seongnam-si, Republic of Korea; Department of Surgery, Seoul National University College of Medicine, Seoul, Republic of Korea
- Seonghae Yoon: Department of Clinical Pharmacology and Therapeutics, Seoul National University Bundang Hospital, Seongnam-si, Republic of Korea
- Hae-Young Lee: Department of Internal Medicine, Seoul National University Hospital, Seoul, Republic of Korea; Department of Internal Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea
- Kwang-Il Kim: Department of Internal Medicine, Seoul National University Bundang Hospital, Seongnam-si, Republic of Korea; Department of Internal Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea
16. Vagliano I, Dormosh N, Rios M, Luik TT, Buonocore TM, Elbers PWG, Dongelmans DA, Schut MC, Abu-Hanna A. Prognostic models of in-hospital mortality of intensive care patients using neural representation of unstructured text: A systematic review and critical appraisal. J Biomed Inform 2023; 146:104504. PMID: 37742782; DOI: 10.1016/j.jbi.2023.104504.
Abstract
OBJECTIVE To review and critically appraise published and preprint reports of prognostic models of in-hospital mortality of patients in the intensive-care unit (ICU) based on neural representations (embeddings) of clinical notes. METHODS PubMed and arXiv were searched up to August 1, 2022. At least two reviewers independently selected the studies that developed a prognostic model of in-hospital mortality of intensive-care patients using free-text represented as embeddings and extracted data using the CHARMS checklist. Risk of bias was assessed using PROBAST. Reporting on the model was assessed with the TRIPOD guideline. To assess the machine learning components that were used in the models, we present a new descriptive framework based on different techniques to represent text and provide predictions from text. The study protocol was registered in the PROSPERO database (CRD42022354602). RESULTS Eighteen studies out of 2,825 were included. All studies used the publicly available MIMIC dataset. Context-independent word embeddings are widely used. Model discrimination was provided by all studies (AUROC 0.75-0.96), but measures of calibration were scarce. Seven studies used both structured clinical variables and notes. Model discrimination improved when adding clinical notes to variables. None of the models was externally validated, and often a simple train/test split was used for internal validation. Our critical appraisal demonstrated a high risk of bias in all studies and concerns regarding their applicability in clinical practice. CONCLUSION All studies used a neural architecture for prediction and were based on one publicly available dataset. Clinical notes were reported to improve predictive performance when used in addition to only clinical variables. Most studies had methodological, reporting, and applicability issues.
We recommend reporting both model discrimination and calibration, using additional data sources, and using more robust evaluation strategies, including prospective and external validation. Finally, sharing data and code is encouraged to improve study reproducibility.
Affiliation(s)
- I Vagliano: Dept. of Medical Informatics, Amsterdam UMC, University of Amsterdam, Amsterdam, the Netherlands; Amsterdam Public Health (APH), Amsterdam, the Netherlands
- N Dormosh: Dept. of Medical Informatics, Amsterdam UMC, University of Amsterdam, Amsterdam, the Netherlands; Amsterdam Public Health (APH), Amsterdam, the Netherlands
- M Rios: Centre for Translation Studies, University of Vienna, Vienna, Austria
- T T Luik: Amsterdam Public Health (APH), Amsterdam, the Netherlands; Dept. of Medical Biology, Amsterdam UMC, University of Amsterdam, Amsterdam, the Netherlands
- T M Buonocore: Dept. of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
- P W G Elbers: Amsterdam Public Health (APH), Amsterdam, the Netherlands; Dept. of Intensive Care Medicine, Center for Critical Care Computational Intelligence (C4I), Amsterdam Medical Data Science (AMDS), Amsterdam Institute for Infection and Immunity (AII), Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
- D A Dongelmans: Amsterdam Public Health (APH), Amsterdam, the Netherlands; National Intensive Care Evaluation (NICE) Foundation, Amsterdam, the Netherlands; Dept. of Intensive Care Medicine, Amsterdam UMC, University of Amsterdam, Amsterdam, the Netherlands
- M C Schut: Dept. of Medical Informatics, Amsterdam UMC, University of Amsterdam, Amsterdam, the Netherlands; Amsterdam Public Health (APH), Amsterdam, the Netherlands; Dept. of Clinical Chemistry, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
- A Abu-Hanna: Dept. of Medical Informatics, Amsterdam UMC, University of Amsterdam, Amsterdam, the Netherlands; Amsterdam Public Health (APH), Amsterdam, the Netherlands
17. Pungitore S, Subbian V. Assessment of Prediction Tasks and Time Window Selection in Temporal Modeling of Electronic Health Record Data: a Systematic Review. J Healthc Inform Res 2023; 7:313-331. PMID: 37637723; PMCID: PMC10449760; DOI: 10.1007/s41666-023-00143-4.
Abstract
Temporal electronic health record (EHR) data are often preferred for clinical prediction tasks because they offer more complete representations of a patient's pathophysiology than static data. A challenge when working with temporal EHR data is problem formulation, which includes defining the time windows of interest and the prediction task. Our objective was to conduct a systematic review that assessed the definition and reporting of concepts relevant to temporal clinical prediction tasks. We searched the PubMed® and IEEE Xplore® databases for studies from January 1, 2010, onward that applied machine learning models to EHR data for patient outcome prediction. Publications applying time-series methods were selected for further review. We identified 92 studies and summarized them by clinical context and by definition and reporting of the prediction problem. For the time windows of interest, 12 studies did not discuss window lengths, 57 used a single set of window lengths, and 23 evaluated the relationship between window length and model performance. We also found that 72 studies had appropriate reporting of the prediction task. However, evaluation of prediction problem formulation for temporal EHR data was complicated by heterogeneity in the assessment and reporting of these concepts. Even among studies modeling similar clinical outcomes, there were variations in the terminology used to describe the prediction problem, the rationale for window lengths, and the determination of the outcome of interest. As temporal modeling using EHR data expands, minimal reporting standards should include time-series-specific concerns to promote rigor and reproducibility in future studies and facilitate model implementation in clinical settings. Supplementary Information The online version contains supplementary material available at 10.1007/s41666-023-00143-4.
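A minimal sketch (hypothetical event names and dates) of the window formulation this review examines: features come from an observation window before an index date, and the outcome is sought in a prediction window after it:

```python
from datetime import datetime, timedelta

def split_windows(events, index_time, obs_days=365, pred_days=30):
    """Partition timestamped events into feature (observation) and
    outcome (prediction) windows around an index date."""
    obs_start = index_time - timedelta(days=obs_days)
    pred_end = index_time + timedelta(days=pred_days)
    observed = [e for t, e in events if obs_start <= t < index_time]
    outcome = [e for t, e in events if index_time <= t < pred_end]
    return observed, outcome

index_date = datetime(2024, 6, 1)
events = [
    (datetime(2022, 1, 1), "dx:hypertension"),  # before the observation window
    (datetime(2023, 9, 1), "dx:diabetes"),
    (datetime(2024, 5, 20), "lab:hba1c"),
    (datetime(2024, 6, 15), "readmission"),
]
observed, outcome = split_windows(events, index_date)
print(observed)  # ['dx:diabetes', 'lab:hba1c']
print(outcome)   # ['readmission']
```

The `obs_days` and `pred_days` choices are exactly the under-reported design decisions the review highlights; shrinking `obs_days` here would silently drop `dx:diabetes` from the feature set.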
Affiliation(s)
- Sarah Pungitore: Program in Applied Mathematics, Department of Mathematics, 617 N Santa Rita Ave, Tucson, AZ 85721, USA
- Vignesh Subbian: Department of Biomedical Engineering, The University of Arizona, Tucson, AZ 85721-0020, USA; Department of Systems and Industrial Engineering, The University of Arizona, Tucson, AZ 85721-0020, USA
18. Wolfien M, Ahmadi N, Fitzer K, Grummt S, Heine KL, Jung IC, Krefting D, Kühn A, Peng Y, Reinecke I, Scheel J, Schmidt T, Schmücker P, Schüttler C, Waltemath D, Zoch M, Sedlmayr M. Ten Topics to Get Started in Medical Informatics Research. J Med Internet Res 2023; 25:e45948. PMID: 37486754; PMCID: PMC10407648; DOI: 10.2196/45948.
Abstract
The vast and heterogeneous data being constantly generated in clinics can provide great wealth for patients and research alike. The quickly evolving field of medical informatics research has contributed numerous concepts, algorithms, and standards to facilitate this development. However, these difficult relationships, complex terminologies, and multiple implementations can present obstacles for people who want to get active in the field. With a particular focus on medical informatics research conducted in Germany, we present in our Viewpoint a set of 10 important topics to improve the overall interdisciplinary communication between different stakeholders (eg, physicians, computational experts, experimentalists, students, patient representatives). This may lower the barriers to entry and offer a starting point for collaborations at different levels. The suggested topics are briefly introduced, then general best practice guidance is given, and further resources for in-depth reading or hands-on tutorials are recommended. In addition, the topics are set to cover current aspects and open research gaps of the medical informatics domain, including data regulations and concepts; data harmonization and processing; and data evaluation, visualization, and dissemination. In addition, we give an example on how these topics can be integrated in a medical informatics curriculum for higher education. By recognizing these topics, readers will be able to (1) set clinical and research data into the context of medical informatics, understanding what is possible to achieve with data or how data should be handled in terms of data privacy and storage; (2) distinguish current interoperability standards and obtain first insights into the processes leading to effective data transfer and analysis; and (3) value the use of newly developed technical approaches to utilize the full potential of clinical data.
Collapse
Affiliation(s)
- Markus Wolfien
- Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Center for Scalable Data Analytics and Artificial Intelligence, Dresden, Germany
- Najia Ahmadi
- Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Kai Fitzer
- Core Unit Data Integration Center, University Medicine Greifswald, Greifswald, Germany
- Sophia Grummt
- Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Kilian-Ludwig Heine
- Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Ian-C Jung
- Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Dagmar Krefting
- Department of Medical Informatics, University Medical Center, Goettingen, Germany
- Andreas Kühn
- Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Yuan Peng
- Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Ines Reinecke
- Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Julia Scheel
- Department of Systems Biology and Bioinformatics, University of Rostock, Rostock, Germany
- Tobias Schmidt
- Institute for Medical Informatics, University of Applied Sciences Mannheim, Mannheim, Germany
- Paul Schmücker
- Institute for Medical Informatics, University of Applied Sciences Mannheim, Mannheim, Germany
- Christina Schüttler
- Central Biobank Erlangen, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Dagmar Waltemath
- Core Unit Data Integration Center, University Medicine Greifswald, Greifswald, Germany
- Department of Medical Informatics, University Medicine Greifswald, Greifswald, Germany
- Michele Zoch
- Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Martin Sedlmayr
- Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Center for Scalable Data Analytics and Artificial Intelligence, Dresden, Germany
19
Lee DY, Choi B, Kim C, Fridgeirsson E, Reps J, Kim M, Kim J, Jang JW, Rhee SY, Seo WW, Lee S, Son SJ, Park RW. Privacy-Preserving Federated Model Predicting Bipolar Transition in Patients With Depression: Prediction Model Development Study. J Med Internet Res 2023; 25:e46165. [PMID: 37471130 PMCID: PMC10401196 DOI: 10.2196/46165] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2023] [Revised: 03/10/2023] [Accepted: 06/29/2023] [Indexed: 07/21/2023] Open
Abstract
BACKGROUND Mood disorders have emerged as a serious public health concern; in particular, bipolar disorder has a less favorable prognosis than depression. Although prompt recognition of the conversion of depression to bipolar disorder is needed, early prediction is challenging due to overlapping symptoms. Recently, there have been attempts to develop prediction models using federated learning. Federated learning in medical fields is a method for training multi-institutional machine learning models without patient-level data sharing. OBJECTIVE This study aims to develop and validate a federated, differentially private, multi-institutional bipolar transition prediction model. METHODS This retrospective study enrolled patients diagnosed with a first depressive episode at 5 tertiary hospitals in South Korea. We developed models for predicting bipolar transition using data from 17,631 patients in 4 institutions, and used data from 4541 patients at 1 institution for external validation. We created standardized pipelines to extract large-scale clinical features from the 4 institutions without any code modification. Moreover, we performed feature selection in a federated environment for computational efficiency and applied differential privacy to gradient updates. Finally, we compared the federated model and the 4 local models developed with each hospital's data on internal and external validation data sets. RESULTS In the internal data set, 279 out of 17,631 patients showed bipolar disorder transition. In the external data set, 39 out of 4541 patients showed bipolar disorder transition. The average performance of the federated model in the internal test (area under the curve [AUC] 0.726) and external validation (AUC 0.719) data sets was higher than that of the locally developed models (AUC 0.642-0.707 and AUC 0.642-0.699, respectively).
In the federated model, classifications were driven by several predictors such as the Charlson index (low scores were associated with bipolar transition, which may be due to younger age), severe depression, anxiolytics, young age, and visiting months (the bipolar transition was associated with seasonality, especially during the spring and summer months). CONCLUSIONS We developed and validated a differentially private federated model by using distributed multi-institutional psychiatric data with standardized pipelines in a real-world environment. The federated model performed better than models using local data only.
Affiliation(s)
- Dong Yun Lee
- Department of Biomedical Informatics, Ajou University School of Medicine, Suwon-si, Republic of Korea
- Byungjin Choi
- Department of Biomedical Informatics, Ajou University School of Medicine, Suwon-si, Republic of Korea
- Chungsoo Kim
- Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon-si, Republic of Korea
- Egill Fridgeirsson
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, Netherlands
- Jenna Reps
- Observational Health Data Analytics, Janssen Research and Development, Titusville, NJ, United States
- Myoungsuk Kim
- Data Solution Team, Evidnet Co, Ltd, Sungnam, Republic of Korea
- Jihyeong Kim
- Data Solution Team, Evidnet Co, Ltd, Sungnam, Republic of Korea
- Jae-Won Jang
- Department of Neurology, Kangwon National University Hospital, Kangwon National University School of Medicine, Chuncheon, Republic of Korea
- Sang Youl Rhee
- Center for Digital Health, Medical Science Research Institute, Kyung Hee University Medical Center, Seoul, Republic of Korea
- Department of Endocrinology and Metabolism, Kyung Hee University College of Medicine, Seoul, Republic of Korea
- Won-Woo Seo
- Department of Internal Medicine, Kangdong Sacred Heart Hospital, Hallym University College of Medicine, Seoul, Republic of Korea
- Seunghoon Lee
- Department of Psychiatry, Myongji Hospital, Goyang, Republic of Korea
- Sang Joon Son
- Department of Psychiatry, Ajou University School of Medicine, Suwon-si, Republic of Korea
- Rae Woong Park
- Department of Biomedical Informatics, Ajou University School of Medicine, Suwon-si, Republic of Korea
- Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon-si, Republic of Korea
20
Zhao R, Zhang W, Zhang Z, He C, Xu R, Tang X, Wang B. Evaluation of reporting quality of cohort studies using real-world data based on RECORD: systematic review. BMC Med Res Methodol 2023; 23:152. [PMID: 37386371 PMCID: PMC10308622 DOI: 10.1186/s12874-023-01960-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2022] [Accepted: 05/31/2023] [Indexed: 07/01/2023] Open
Abstract
OBJECTIVE Real-world data (RWD) and real-world evidence have received increasing attention in recent years. We aimed to evaluate the reporting quality of cohort studies using RWD published between 2013 and 2021 and to analyze possible associated factors. METHODS We conducted a comprehensive search in Medline and Embase through the OVID interface on April 29, 2022, for cohort studies published from 2013 to 2021. Studies aimed at comparing the effectiveness or safety of exposure factors in the real-world setting were included. The evaluation was based on the REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement. Agreement for inclusion and evaluation was calculated using Cohen's kappa. The Pearson chi-square test or Fisher's exact test and the Mann-Whitney U test were used to analyze possible factors, including the release of RECORD, journal impact factors (IFs), and article citations. Bonferroni's correction was applied for multiple comparisons. Interrupted time series analysis was performed to display changes in reporting quality over time. RESULTS A total of 187 articles were included. The mean ± SD percentage of adequately reported items across the 187 articles was 44.7 ± 14.3, with a range of 11.1-87%. Of the 23 items, the adequate reporting rate reached 50% for 10 items, and the reporting rate of some vital items was inadequate. After Bonferroni's correction, the reporting of only one item improved significantly after the release of RECORD, and there was no significant improvement in overall reporting quality. In the interrupted time series analysis, there were no significant changes in the slope (p = 0.42) or level (p = 0.12) of the adequate reporting rate. Journal IFs and citations were each related to 2 areas, and IFs were significantly higher in articles with high reporting quality.
CONCLUSION Endorsement of the RECORD checklist was generally inadequate in cohort studies using RWD and has not improved in recent years. We encourage researchers to follow relevant reporting guidelines when utilizing RWD for research.
Affiliation(s)
- Ran Zhao
- Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing, China
- Wen Zhang
- Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, China
- ZeDan Zhang
- Traditional Chinese Medicine Data Center, China Academy of Chinese Medical Sciences, Beijing, China
- Chang He
- Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing, China
- Rong Xu
- Guang'anmen Hospital, China Academy of Chinese Medical Sciences, Beijing, China
- XuDong Tang
- China Academy of Chinese Medical Sciences, Beijing, China
- Bin Wang
- Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing, China
- Traditional Chinese Medicine Data Center, China Academy of Chinese Medical Sciences, Beijing, China
21
Wang M, Sushil M, Miao BY, Butte AJ. Bottom-up and top-down paradigms of artificial intelligence research approaches to healthcare data science using growing real-world big data. J Am Med Inform Assoc 2023; 30:1323-1332. [PMID: 37187158 PMCID: PMC10280344 DOI: 10.1093/jamia/ocad085] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2023] [Revised: 04/03/2023] [Accepted: 05/04/2023] [Indexed: 05/17/2023] Open
Abstract
OBJECTIVES As real-world electronic health record (EHR) data continue to grow exponentially, novel methodologies involving artificial intelligence (AI) are increasingly being applied to enable efficient data-driven learning and, ultimately, to advance healthcare. Our objective is to provide readers with an understanding of evolving computational methods and to help them decide which methods to pursue. TARGET AUDIENCE The sheer diversity of existing methods presents a challenge for health scientists who are beginning to apply computational methods to their research. This tutorial is therefore aimed at scientists working with EHR data who are early entrants into the field of applying AI methodologies. SCOPE This manuscript describes the diverse and growing AI research approaches in healthcare data science and categorizes them into 2 distinct paradigms, bottom-up and top-down, to provide health scientists venturing into AI research with an understanding of the evolving computational methods, viewed through the lens of real-world healthcare data.
Affiliation(s)
- Michelle Wang
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, California, USA
- Madhumita Sushil
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, California, USA
- Brenda Y Miao
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, California, USA
- Atul J Butte
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, California, USA
- Department of Pediatrics, University of California, San Francisco, San Francisco, California, USA
22
Bechler KK, Stolyar L, Steinberg E, Posada J, Minty E, Shah NH. Predicting patients who are likely to develop Lupus Nephritis of those newly diagnosed with Systemic Lupus Erythematosus. AMIA Annu Symp Proc 2023; 2022:221-230. [PMID: 37128416 PMCID: PMC10148321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Subscribe] [Scholar Register] [Indexed: 05/03/2023]
Abstract
Patients diagnosed with systemic lupus erythematosus (SLE) suffer from a decreased quality of life, an increased risk of medical complications, and an increased risk of death. In particular, approximately 50% of SLE patients progress to develop lupus nephritis, which often leads to life-threatening end-stage renal disease (ESRD) requiring dialysis or kidney transplant. The challenge is that lupus nephritis is diagnosed via a kidney biopsy, which is typically performed only after a noticeable decrease in kidney function, leaving little room for proactive or preventative measures. The ability to predict which patients are most likely to develop lupus nephritis has the potential to shift lupus nephritis disease management from reactive to proactive. We present a clinically useful prediction model to predict which patients with newly diagnosed SLE will go on to develop lupus nephritis in the next five years.
Affiliation(s)
- Katelyn K Bechler
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA
- Liya Stolyar
- Department of Medicine, Stanford University School of Medicine, Stanford, CA
- Ethan Steinberg
- Department of Computer Science, Stanford University, Stanford, CA
- Jose Posada
- Stanford Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, CA
- Department of Systems Engineering and Computing, Universidad del Norte, Barranquilla, Colombia
- Evan Minty
- O'Brien Institute for Public Health, Faculty of Medicine, University of Calgary, Canada
- Nigam H Shah
- Department of Medicine, Stanford University School of Medicine, Stanford, CA
- Stanford Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, CA
23
Rekkas A, van Klaveren D, Ryan PB, Steyerberg EW, Kent DM, Rijnbeek PR. A standardized framework for risk-based assessment of treatment effect heterogeneity in observational healthcare databases. NPJ Digit Med 2023; 6:58. [PMID: 36991144 DOI: 10.1038/s41746-023-00794-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2022] [Accepted: 03/10/2023] [Indexed: 03/31/2023] Open
Abstract
Treatment effects are often anticipated to vary across groups of patients with different baseline risk. The Predictive Approaches to Treatment Effect Heterogeneity (PATH) statement focused on baseline risk as a robust predictor of treatment effect and provided guidance on risk-based assessment of treatment effect heterogeneity in randomized controlled trials. The aim of this study is to extend this approach to the observational setting using a standardized, scalable framework. The proposed framework consists of five steps: (1) definition of the research aim, i.e., the population, the treatment, the comparator, and the outcome(s) of interest; (2) identification of relevant databases; (3) development of a prediction model for the outcome(s) of interest; (4) estimation of relative and absolute treatment effects within strata of predicted risk, after adjusting for observed confounding; (5) presentation of the results. We demonstrate our framework by evaluating heterogeneity of the effect of thiazide or thiazide-like diuretics versus angiotensin-converting enzyme inhibitors on three efficacy and nine safety outcomes across three observational databases. We provide a publicly available R software package for applying this framework to any database mapped to the Observational Medical Outcomes Partnership Common Data Model. In our demonstration, patients at low risk of acute myocardial infarction received negligible absolute benefits for all three efficacy outcomes, while benefits were more pronounced in the highest-risk group, especially for acute myocardial infarction. Our framework allows for the evaluation of differential treatment effects across risk strata, which offers the opportunity to consider the benefit-harm trade-off between alternative treatments.
Affiliation(s)
- Alexandros Rekkas
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- David van Klaveren
- Department of Public Health, Erasmus University Medical Center, Rotterdam, The Netherlands
- Predictive Analytics and Comparative Effectiveness (PACE) Center, Institute for Clinical Research and Health Policy Studies (ICRHPS), Tufts Medical Center, Boston, MA, USA
- Patrick B Ryan
- Janssen Research and Development, 125 Trenton Harbourton Road, Titusville, NJ, 08560, USA
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York, USA
- Ewout W Steyerberg
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
- David M Kent
- Predictive Analytics and Comparative Effectiveness (PACE) Center, Institute for Clinical Research and Health Policy Studies (ICRHPS), Tufts Medical Center, Boston, MA, USA
- Peter R Rijnbeek
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
24
Junior EPP, Normando P, Flores-Ortiz R, Afzal MU, Jamil MA, Bertolin SF, Oliveira VDA, Martufi V, de Sousa F, Bashir A, Burn E, Ichihara MY, Barreto ML, Salles TD, Prieto-Alhambra D, Hafeez H, Khalid S. Integrating real-world data from Brazil and Pakistan into the OMOP common data model and standardized health analytics framework to characterize COVID-19 in the Global South. J Am Med Inform Assoc 2023; 30:643-655. [PMID: 36264262 PMCID: PMC9619798 DOI: 10.1093/jamia/ocac180] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2022] [Revised: 08/16/2022] [Accepted: 09/29/2022] [Indexed: 11/13/2022] Open
Abstract
OBJECTIVES The aim of this work is to demonstrate the use of a standardized health informatics framework to generate reliable and reproducible real-world evidence from Latin America and South Asia towards characterizing coronavirus disease 2019 (COVID-19) in the Global South. MATERIALS AND METHODS Patient-level COVID-19 records collected in a patient self-reported notification system, hospital inpatient and outpatient records, and community diagnostic labs were harmonized to the Observational Medical Outcomes Partnership (OMOP) common data model and analyzed using a federated network analytics framework. Clinical characteristics of individuals tested for, diagnosed with or tested positive for, hospitalized with, admitted to an intensive care unit with, or dying with COVID-19 were estimated. RESULTS Two COVID-19 databases covering 8.3 million people from Pakistan and 2.6 million people from Bahia, Brazil, were analyzed. 109 504 (Pakistan) and 921 (Brazil) medical concepts were harmonized to the OMOP common data model. In total, 341 505 (4.1%) people in the Pakistan dataset and 1 312 832 (49.2%) people in the Brazilian dataset were tested for COVID-19 between January 1, 2020 and April 20, 2022, with a median [IQR] age of 36 [25, 76] and 38 [27, 50], respectively; 40.3% and 56.5% were female in Pakistan and Brazil, respectively. 1.2% of individuals in the Pakistan dataset had Afghan ethnicity. In Brazil, 52.3% had mixed ethnicity. In agreement with international findings, COVID-19 outcomes were more severe in men, the elderly, and those with underlying health conditions. CONCLUSIONS COVID-19 data from 2 large countries in the Global South were harmonized and analyzed using a standardized health informatics framework developed by an international community of health informaticians.
This proof-of-concept study demonstrates a potential open science framework for global knowledge mobilization and clinical translation for timely response to healthcare needs in pandemics and beyond.
Affiliation(s)
- Elzo Pereira Pinto Junior
- Center of Data and Knowledge Integration for Health (Cidacs), Fiocruz-Brazil, Parque Tecnológico da Edf, Tecnocentro, R. Mundo, Salvador, BA 41745-715, Brazil
- Priscilla Normando
- Center of Data and Knowledge Integration for Health (Cidacs), Fiocruz-Brazil, Parque Tecnológico da Edf, Tecnocentro, R. Mundo, Salvador, BA 41745-715, Brazil
- Renzo Flores-Ortiz
- Center of Data and Knowledge Integration for Health (Cidacs), Fiocruz-Brazil, Parque Tecnológico da Edf, Tecnocentro, R. Mundo, Salvador, BA 41745-715, Brazil
- Muhammad Usman Afzal
- Shaukat Khanum Memorial Cancer Hospital and Research Centre, Johar Town, Lahore, 54840, Pakistan
- Muhammad Asaad Jamil
- Shaukat Khanum Memorial Cancer Hospital and Research Centre, Johar Town, Lahore, 54840, Pakistan
- Sergio Fernandez Bertolin
- Fundació Institut Universitari per a la recerca a l'Atenció Primària de Salut Jordi Gol i Gurina (IDIAPJGol), Barcelona, 587 08007, Spain
- Vinícius de Araújo Oliveira
- Center of Data and Knowledge Integration for Health (Cidacs), Fiocruz-Brazil, Parque Tecnológico da Edf, Tecnocentro, R. Mundo, Salvador, BA 41745-715, Brazil
- Valentina Martufi
- Center of Data and Knowledge Integration for Health (Cidacs), Fiocruz-Brazil, Parque Tecnológico da Edf, Tecnocentro, R. Mundo, Salvador, BA 41745-715, Brazil
- Fernanda de Sousa
- Center of Data and Knowledge Integration for Health (Cidacs), Fiocruz-Brazil, Parque Tecnológico da Edf, Tecnocentro, R. Mundo, Salvador, BA 41745-715, Brazil
- Amir Bashir
- Shaukat Khanum Memorial Cancer Hospital and Research Centre, Johar Town, Lahore, 54840, Pakistan
- Edward Burn
- Centre for Statistics in Medicine, Botnar Research Centre, University of Oxford, Oxford, OX3 7LD, United Kingdom
- Maria Yury Ichihara
- Center of Data and Knowledge Integration for Health (Cidacs), Fiocruz-Brazil, Parque Tecnológico da Edf, Tecnocentro, R. Mundo, Salvador, BA 41745-715, Brazil
- Maurício L Barreto
- Center of Data and Knowledge Integration for Health (Cidacs), Fiocruz-Brazil, Parque Tecnológico da Edf, Tecnocentro, R. Mundo, Salvador, BA 41745-715, Brazil
- Talita Duarte Salles
- Fundació Institut Universitari per a la recerca a l'Atenció Primària de Salut Jordi Gol i Gurina (IDIAPJGol), Barcelona, 587 08007, Spain
- Daniel Prieto-Alhambra
- Centre for Statistics in Medicine, Botnar Research Centre, University of Oxford, Oxford, OX3 7LD, United Kingdom
- Haroon Hafeez
- Shaukat Khanum Memorial Cancer Hospital and Research Centre, Johar Town, Lahore, 54840, Pakistan
- Sara Khalid
- Centre for Statistics in Medicine, Botnar Research Centre, University of Oxford, Oxford, OX3 7LD, United Kingdom
25
Guo LL, Steinberg E, Fleming SL, Posada J, Lemmon J, Pfohl SR, Shah N, Fries J, Sung L. EHR foundation models improve robustness in the presence of temporal distribution shift. Sci Rep 2023; 13:3767. [PMID: 36882576 PMCID: PMC9992466 DOI: 10.1038/s41598-023-30820-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/13/2022] [Accepted: 03/02/2023] [Indexed: 03/09/2023] Open
Abstract
Temporal distribution shift negatively impacts the performance of clinical prediction models over time. Pretraining foundation models using self-supervised learning on electronic health records (EHR) may be effective in acquiring informative global patterns that can improve the robustness of task-specific models. The objective was to evaluate the utility of EHR foundation models in improving the in-distribution (ID) and out-of-distribution (OOD) performance of clinical prediction models. Transformer- and gated recurrent unit-based foundation models were pretrained on EHR of up to 1.8 M patients (382 M coded events) collected within pre-determined year groups (e.g., 2009-2012) and were subsequently used to construct patient representations for patients admitted to inpatient units. These representations were used to train logistic regression models to predict hospital mortality, long length of stay, 30-day readmission, and ICU admission. We compared our EHR foundation models with baseline logistic regression models learned on count-based representations (count-LR) in ID and OOD year groups. Performance was measured using area-under-the-receiver-operating-characteristic curve (AUROC), area-under-the-precision-recall curve, and absolute calibration error. Both transformer and recurrent-based foundation models generally showed better ID and OOD discrimination relative to count-LR and often exhibited less decay in tasks where there is observable degradation of discrimination performance (average AUROC decay of 3% for transformer-based foundation model vs. 7% for count-LR after 5-9 years). In addition, the performance and robustness of transformer-based foundation models continued to improve as pretraining set size increased. These results suggest that pretraining EHR foundation models at scale is a useful approach for developing clinical prediction models that perform well in the presence of temporal distribution shift.
Affiliation(s)
- Lin Lawrence Guo
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON, Canada
- Ethan Steinberg
- Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA, USA
- Scott Lanyon Fleming
- Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA, USA
- Jose Posada
- Universidad del Norte, Barranquilla, Colombia
- Joshua Lemmon
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON, Canada
- Stephen R Pfohl
- Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA, USA
- Nigam Shah
- Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA, USA
- Jason Fries
- Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA, USA
- Lillian Sung
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON, Canada
- Division of Haematology/Oncology, The Hospital for Sick Children, 555 University Avenue, Toronto, ON, M5G1X8, Canada
26
Chandran U, Reps J, Yang R, Vachani A, Maldonado F, Kalsekar I. Machine Learning and Real-World Data to Predict Lung Cancer Risk in Routine Care. Cancer Epidemiol Biomarkers Prev 2023; 32:337-343. [PMID: 36576991 PMCID: PMC9986687 DOI: 10.1158/1055-9965.epi-22-0873] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2022] [Revised: 10/07/2022] [Accepted: 12/19/2022] [Indexed: 12/29/2022] Open
Abstract
BACKGROUND This study used machine learning to develop a 3-year lung cancer risk prediction model with large real-world data in a mostly younger population. METHODS Over 4.7 million individuals, aged 45 to 65 years with no history of any cancer or of lung cancer screening, diagnostic, or treatment procedures, and with an outpatient visit in 2013, were identified in Optum's de-identified Electronic Health Record (EHR) dataset. A least absolute shrinkage and selection operator model was fit using all available data in the 365 days prior. Temporal validation was assessed with recent data. External validation was assessed with data from the Mercy Health Systems EHR and Optum's de-identified Clinformatics Data Mart Database. Racial inequities in model discrimination were assessed with xAUCs. RESULTS The model AUC was 0.76. Top predictors included age, smoking, race, ethnicity, and diagnosis of chronic obstructive pulmonary disease. The model identified a high-risk group with lung cancer incidence 9 times the average cohort incidence, representing 10% of patients with lung cancer. The model performed well temporally and externally, although performance was reduced for Asians and Hispanics. CONCLUSIONS A high-dimensional model trained using big data identified a subset of patients with high lung cancer risk. The model demonstrated transportability to EHR and claims data, while underscoring the need to assess racial disparities when using machine learning methods. IMPACT This internally and externally validated lung cancer prediction model based on real-world data is available on an open-source platform for broad sharing and application. Model integration into an EHR system could minimize physician burden by automating the identification of high-risk patients.
Affiliation(s)
- Urmila Chandran
- Johnson & Johnson Global Epidemiology, Titusville, New Jersey
- Lung Cancer Initiative, Johnson & Johnson, New Brunswick, New Jersey
- Jenna Reps
- Johnson & Johnson Global Epidemiology, Titusville, New Jersey
- Robert Yang
- Lung Cancer Initiative, Johnson & Johnson, New Brunswick, New Jersey
- Anil Vachani
- University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania
- Iftekhar Kalsekar
- Lung Cancer Initiative, Johnson & Johnson, New Brunswick, New Jersey
27
Lee DY, Cho YH, Kim M, Jeong CW, Cha JM, Won GH, Noh JS, Son SJ, Park RW. Association between impaired glucose metabolism and long-term prognosis at the time of diagnosis of depression: Impaired glucose metabolism as a promising biomarker proposed through a machine-learning approach. Eur Psychiatry 2023; 66:e21. [PMID: 36734114 PMCID: PMC9970146 DOI: 10.1192/j.eurpsy.2023.10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 02/04/2023] Open
Abstract
BACKGROUND Predicting the course of depression is necessary for personalized treatment. Impaired glucose metabolism (IGM) has been proposed as a promising depression biomarker, but no consensus has been reached. This study aimed to predict IGM at the time of depression diagnosis and examine the relationship between long-term prognosis and the predicted results. METHODS Clinical data were extracted from four electronic health record databases in South Korea. The study population included patients with depression, and the outcome was IGM within 1 year. One database was used to develop the model using three algorithms. External validation was performed using the best algorithm across the other three databases. The area under the curve (AUC) was calculated to determine the model's performance. Kaplan-Meier and Cox survival analyses of the risk of hospitalization for depression as the long-term outcome were performed. A meta-analysis of the long-term outcome was performed across the four databases. RESULTS A prediction model was developed using the data of 3,668 people, with an AUC of 0.781 with least absolute shrinkage and selection operator (LASSO) logistic regression. In the external validation, the AUCs were 0.643, 0.610, and 0.515. Survival analysis and meta-analysis were performed on the predicted results; the hazard ratio for hospitalization for depression in patients predicted to have IGM was 1.20 (95% confidence interval [CI] 1.02-1.41, p = 0.027) at a 3-year follow-up. CONCLUSIONS We developed prediction models for IGM occurrence within a year. The predicted results were related to the long-term prognosis of depression, suggesting IGM as a promising biomarker related to the prognosis of depression.
Affiliation(s)
- Dong Yun Lee
- Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Korea
- Yong Hyuk Cho
- Department of Psychiatry, Ajou University School of Medicine, Suwon, Korea; Department of Medical Sciences, Graduate School of Ajou University, Suwon, South Korea
- Chang-Won Jeong
- Medical Convergence Research Center, Wonkwang University, Iksan, Korea
- Jae Myung Cha
- Department of Gastroenterology, Gang Dong Kyung Hee University Hospital, Seoul, Korea
- Geun Hui Won
- Department of Psychiatry, Catholic University of Daegu School of Medicine, Daegu, Korea
- Jai Sung Noh
- Department of Psychiatry, Ajou University School of Medicine, Suwon, Korea
- Sang Joon Son
- Department of Psychiatry, Ajou University School of Medicine, Suwon, Korea
- Rae Woong Park
- Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Korea
28
Park K, Cho M, Song M, Yoo S, Baek H, Kim S, Kim K. Exploring the potential of OMOP common data model for process mining in healthcare. PLoS One 2023; 18:e0279641. [PMID: 36595527 PMCID: PMC9810199 DOI: 10.1371/journal.pone.0279641] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2022] [Accepted: 12/09/2022] [Indexed: 01/04/2023] Open
Abstract
BACKGROUND AND OBJECTIVE Electronic Health Records (EHR) are increasingly being converted to Common Data Models (CDMs), database schemas that provide standardized structure and vocabularies to facilitate collaborative observational research. To date, however, few attempts have been made to leverage CDM data for healthcare process mining, a technique for deriving process-related knowledge (e.g., process models) from event logs. This paper presents a method to extract, construct, and analyze event logs from the Observational Medical Outcomes Partnership (OMOP) CDM for process mining, and demonstrates CDM-based healthcare process mining on several real-life study cases, answering questions frequently posed in process mining within the CDM environment. METHODS We propose a method to extract, construct, and analyze event logs from the OMOP CDM for process types including inpatient, outpatient, and emergency room processes, as well as the patient journey. Using the proposed method, we extracted retrospective data for several surgical procedures (i.e., Total Laparoscopic Hysterectomy (TLH), Total Hip Replacement (THR), Coronary Bypass (CB), Transcatheter Aortic Valve Implantation (TAVI), and Pancreaticoduodenectomy (PD)) from the CDM of a Korean tertiary hospital. Patient data were extracted for each of the operations and analyzed using several process mining techniques. RESULTS Using process mining, clinical pathways, outpatient process models, emergency room process models, and patient journeys are demonstrated with the extracted logs. The results show the CDM's usability as a novel and valuable data source for healthcare process analysis, with a few considerations: we found that the CDM should be complemented by internal and external data sources to address the administrative and operational aspects of healthcare processes, particularly for outpatient and ER process analyses.
CONCLUSION To the best of our knowledge, we are the first to exploit the CDM for healthcare process mining. Specifically, we provide step-by-step guidance, demonstrating process analysis from locating the relevant CDM tables to visualizing results with process mining tools. The proposed method is widely applicable across institutions. This work can help bring a process mining perspective to existing CDM users in the changing Hospital Information Systems (HIS) environment and facilitate CDM-based studies in the process mining research community.
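The event-log construction the authors describe boils down to mapping CDM rows to (case, activity, timestamp) triples. A minimal sketch on invented rows, using `visit_occurrence_id` as the case identifier and readable activity labels in place of numeric concept IDs (this is not the study's extraction code):

```python
from collections import defaultdict

# Hypothetical rows shaped like the OMOP CDM PROCEDURE_OCCURRENCE table;
# column names follow the CDM, but the values are invented for illustration.
procedure_occurrence = [
    {"person_id": 1, "visit_occurrence_id": 10,
     "procedure_concept_id": "TLH incision", "procedure_datetime": "2021-03-01T09:00"},
    {"person_id": 1, "visit_occurrence_id": 10,
     "procedure_concept_id": "TLH closure", "procedure_datetime": "2021-03-01T11:30"},
    {"person_id": 2, "visit_occurrence_id": 20,
     "procedure_concept_id": "TLH incision", "procedure_datetime": "2021-04-02T08:15"},
]

def build_event_log(rows):
    """Process mining needs (case id, activity, timestamp) triples; here
    each visit is a case and each procedure concept an activity, ordered
    by timestamp (ISO-8601 strings sort chronologically)."""
    log = defaultdict(list)
    for r in rows:
        log[r["visit_occurrence_id"]].append(
            (r["procedure_datetime"], r["procedure_concept_id"]))
    return {case: [act for _, act in sorted(events)]
            for case, events in log.items()}

print(build_event_log(procedure_occurrence))
# → {10: ['TLH incision', 'TLH closure'], 20: ['TLH incision']}
```

A real pipeline would pull these rows with SQL from the CDM and feed the resulting traces to a process mining tool; the grouping-and-sorting step is the same.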
Affiliation(s)
- Kangah Park
- Department of Industrial and Management Engineering, Pohang University of Science and Technology (POSTECH), Pohang, South Korea
- Minsu Cho
- School of Information Convergence, Kwangwoon University, Seoul, South Korea
- Minseok Song
- Department of Industrial and Management Engineering, Pohang University of Science and Technology (POSTECH), Pohang, South Korea
- Sooyoung Yoo
- Healthcare ICT Research Center, Office of eHealth Research and Businesses, Seoul National University Bundang Hospital, Seongnam, South Korea
- Hyunyoung Baek
- Healthcare ICT Research Center, Office of eHealth Research and Businesses, Seoul National University Bundang Hospital, Seongnam, South Korea
- Seok Kim
- Healthcare ICT Research Center, Office of eHealth Research and Businesses, Seoul National University Bundang Hospital, Seongnam, South Korea
- Kidong Kim
- Department of Obstetrics and Gynecology, Seoul National University Bundang Hospital, Seongnam, South Korea
29
John LH, Kors JA, Fridgeirsson EA, Reps JM, Rijnbeek PR. External validation of existing dementia prediction models on observational health data. BMC Med Res Methodol 2022; 22:311. [PMID: 36471238 PMCID: PMC9720950 DOI: 10.1186/s12874-022-01793-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/09/2022] [Accepted: 11/15/2022] [Indexed: 12/07/2022] Open
Abstract
BACKGROUND Many dementia prediction models have been developed, but only a few have been externally validated, which hinders clinical uptake and poses a risk if models are nevertheless applied to actual patients. Externally validating an existing prediction model is a difficult task in which we rely largely on the completeness of model reporting in the published article. In this study, we aim to externally validate existing dementia prediction models. To that end, we define model reporting criteria, review published studies, and externally validate three well-reported models using routinely collected health data from administrative claims and electronic health records. METHODS We identified dementia prediction models that were developed between 2011 and 2020 and assessed whether they could be externally validated given a set of model criteria. In addition, we externally validated three of these models (Walters' Dementia Risk Score, Mehta's RxDx-Dementia Risk Index, and Nori's ADRD dementia prediction model) on a network of six observational health databases from the United States, United Kingdom, Germany, and the Netherlands, including the original development databases of the models. RESULTS We reviewed 59 dementia prediction models. All models reported the prediction method, development database, and target and outcome definitions. Less frequently reported by these 59 prediction models were predictor definitions (52 models), including the time window in which a predictor is assessed (21 models), predictor coefficients (20 models), and the time-at-risk (42 models). The validation of the model by Walters (development c-statistic: 0.84) showed moderate transportability (0.67-0.76 c-statistic). The Mehta model (development c-statistic: 0.81) transported well to some of the external databases (0.69-0.79 c-statistic). The Nori model (development AUROC: 0.69) transported well (0.62-0.68 AUROC) but performed modestly overall.
Recalibration showed improvements for the Walters and Nori models, while recalibration could not be assessed for the Mehta model due to unreported baseline hazard. CONCLUSION We observed that reporting is mostly insufficient to fully externally validate published dementia prediction models, and therefore, it is uncertain how well these models would work in other clinical settings. We emphasize the importance of following established guidelines for reporting clinical prediction models. We recommend that reporting should be more explicit and have external validation in mind if the model is meant to be applied in different settings.
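The recalibration assessed above often begins with an intercept update. A deliberately simplified generic sketch (a single logit-scale shift so the mean predicted risk matches the observed event rate), not the authors' actual procedure:

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def expit(x):
    return 1 / (1 + math.exp(-x))

def recalibrate_intercept(y, p_hat):
    """Intercept-only recalibration: shift every predicted risk on the
    logit scale so the average prediction matches the observed event
    rate -- a crude stand-in for refitting baseline risk in a new database."""
    delta = logit(sum(y) / len(y)) - logit(sum(p_hat) / len(p_hat))
    return [expit(logit(p) + delta) for p in p_hat]

# a model that systematically overestimates risk in the new setting
y     = [0, 0, 0, 1]          # observed event rate 0.25
p_hat = [0.4, 0.5, 0.6, 0.7]  # mean prediction 0.55
p_new = recalibrate_intercept(y, p_hat)
print([round(p, 3) for p in p_new])
```

Full logistic recalibration would also refit a slope on the linear predictor; the intercept shift alone already illustrates why miscalibrated but discriminating models can be repaired, and why the Mehta model could not be recalibrated without its reported baseline hazard.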
Affiliation(s)
- Luis H. John
- Department of Medical Informatics, Erasmus University Medical Center, Dr. Molewaterplein 40, 3015 GD Rotterdam, The Netherlands
- Jan A. Kors
- Department of Medical Informatics, Erasmus University Medical Center, Dr. Molewaterplein 40, 3015 GD Rotterdam, The Netherlands
- Egill A. Fridgeirsson
- Department of Medical Informatics, Erasmus University Medical Center, Dr. Molewaterplein 40, 3015 GD Rotterdam, The Netherlands
- Jenna M. Reps
- Janssen Research and Development, 1125 Trenton Harbourton Rd, Titusville, NJ 08560, USA
- Peter R. Rijnbeek
- Department of Medical Informatics, Erasmus University Medical Center, Dr. Molewaterplein 40, 3015 GD Rotterdam, The Netherlands
30
You SC, Lee S, Choi B, Park RW. Establishment of an International Evidence Sharing Network Through Common Data Model for Cardiovascular Research. Korean Circ J 2022; 52:853-864. [PMID: 36478647 PMCID: PMC9742390 DOI: 10.4070/kcj.2022.0294] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2022] [Accepted: 11/10/2022] [Indexed: 08/21/2023] Open
Abstract
A retrospective observational study is one of the most widely used research designs in medicine. However, evidence derived from a single data source is likely to contain biases such as selection bias, information bias, and confounding. Acquiring enough data from multiple institutions is one of the most effective ways to overcome these limitations, but acquiring data from many institutions across many countries requires enormous effort owing to financial, technical, ethical, and legal issues, as well as the need to standardize data structure and semantics. The Observational Health Data Sciences and Informatics (OHDSI) research network has standardized 928 million unique records, or 12% of the world's population, into a common structure and meaning, and has established a research network of 453 data partners from 41 countries around the world. OHDSI is a distributed research network in which researchers do not own or directly share data; only analysis results are shared. Generating shared evidence without sharing data can, however, be difficult to grasp. In this review, we examine the basic principles of OHDSI, the common data model, distributed research networks, and some representative studies in the cardiovascular field using the network. This paper also briefly introduces a Korean distributed research network named FeederNet.
Affiliation(s)
- Seng Chan You
- Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Seoul, Korea
- Institute for Innovation in Digital Healthcare, Yonsei University, Seoul, Korea
- Seongwon Lee
- Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Korea
- Byungjin Choi
- Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Korea
- Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, Korea
- Rae Woong Park
- Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Korea
- Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, Korea
31
Semantic Data Visualisation for Biomedical Database Catalogues. Healthcare (Basel) 2022; 10:healthcare10112287. [DOI: 10.3390/healthcare10112287] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2022] [Revised: 11/08/2022] [Accepted: 11/10/2022] [Indexed: 11/16/2022] Open
Abstract
Biomedical databases often have restricted access policies and governance rules. An adequate description of their content is therefore essential for researchers who wish to use them for medical research. One strategy for publishing information without disclosing patient-level data is database fingerprinting and aggregate characterisation. However, this information is still presented in a format that makes it challenging to search, analyse, and decide on the best databases for a domain of study. Several strategies allow one to visualise and compare the characteristics of multiple biomedical databases. Our study focused on a European platform for sharing and disseminating biomedical data. We use semantic data visualisation techniques to assist in comparing descriptive metadata from several databases. The great advantage lies in streamlining the database selection process while ensuring that sensitive details are not shared. To this end, we considered two levels of data visualisation, one characterising a single database and the other involving multiple databases in network-level visualisations. This study revealed the impact of the proposed visualisations and some open challenges in representing semantically annotated biomedical datasets. Identifying future directions in this scope was one of the outcomes of this work.
32
Yang C, Williams RD, Swerdel JN, Almeida JR, Brouwer ES, Burn E, Carmona L, Chatzidionysiou K, Duarte-Salles T, Fakhouri W, Hottgenroth A, Jani M, Kolde R, Kors JA, Kullamaa L, Lane J, Marinier K, Michel A, Stewart HM, Prats-Uribe A, Reisberg S, Sena AG, Torre CO, Verhamme K, Vizcaya D, Weaver J, Ryan P, Prieto-Alhambra D, Rijnbeek PR. Development and external validation of prediction models for adverse health outcomes in rheumatoid arthritis: A multinational real-world cohort analysis. Semin Arthritis Rheum 2022; 56:152050. [PMID: 35728447 DOI: 10.1016/j.semarthrit.2022.152050] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2022] [Revised: 05/11/2022] [Accepted: 06/10/2022] [Indexed: 10/18/2022]
Abstract
BACKGROUND Identification of rheumatoid arthritis (RA) patients at high risk of adverse health outcomes remains a major challenge. We aimed to develop and validate prediction models for a variety of adverse health outcomes in RA patients initiating first-line methotrexate (MTX) monotherapy. METHODS Data from 15 claims and electronic health record databases across 9 countries were used. Models were developed and internally validated on the Optum® De-identified Clinformatics® Data Mart Database using L1-regularized logistic regression to estimate the risk of adverse health outcomes within 3 months (leukopenia, pancytopenia, infection), 2 years (myocardial infarction (MI) and stroke), and 5 years (cancers [colorectal, breast, uterine]) after treatment initiation. Candidate predictors included demographic variables and past medical history. Models were externally validated on all other databases. Performance was assessed using the area under the receiver operator characteristic curve (AUC) and calibration plots. FINDINGS Models were developed and internally validated on 21,547 RA patients and externally validated on 131,928 RA patients. Models for serious infection (AUC: internal 0.74, external ranging from 0.62 to 0.83), MI (AUC: internal 0.76, external ranging from 0.56 to 0.82), and stroke (AUC: internal 0.77, external ranging from 0.63 to 0.95) showed good discrimination and adequate calibration. Models for the other outcomes showed modest internal discrimination (AUC < 0.65) and were not externally validated. INTERPRETATION We developed and validated prediction models for a variety of adverse health outcomes in RA patients initiating first-line MTX monotherapy. The final models for serious infection, MI, and stroke demonstrated good performance across multiple databases and can be studied for clinical use.
FUNDING This activity under the European Health Data & Evidence Network (EHDEN) has received funding from the Innovative Medicines Initiative 2 Joint Undertaking under grant agreement No 806968. This Joint Undertaking receives support from the European Union's Horizon 2020 research and innovation programme and EFPIA.
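The L1-regularized (LASSO-type) logistic regression above, like the LASSO models elsewhere in this list, selects predictors by shrinking coefficients toward zero. Its core mechanism is the soft-thresholding proximal operator, sketched here as a generic illustration (not the study's code):

```python
def soft_threshold(coefs, lam):
    """Proximal operator of the L1 penalty: shrink every coefficient by
    lam and set those within lam of zero exactly to zero -- this is how
    L1 regularization drops weak predictors from a model."""
    return [0.0 if abs(w) <= lam else (w - lam if w > 0 else w + lam)
            for w in coefs]

# the near-zero coefficient (-0.05) is eliminated entirely,
# while the others are shrunk by the penalty
print(soft_threshold([0.9, -0.05, 0.2, -0.6], 0.1))
```

In practice a solver applies this operator iteratively inside coordinate descent or proximal gradient steps; the isolated operator shows why L1 (unlike L2) yields exactly sparse, interpretable candidate-predictor sets.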
Affiliation(s)
- Cynthia Yang
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Ross D Williams
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Joel N Swerdel
- Janssen Research and Development, Titusville, NJ, United States
- Emily S Brouwer
- Janssen Research and Development, Titusville, NJ, United States
- Edward Burn
- Nuffield Department of Orthopaedics Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, United Kingdom; Fundació Institut Universitari per a la recerca a l'Atenció Primària de Salut Jordi Gol i Gurina (IDIAPJGol), Barcelona, Spain
- Talita Duarte-Salles
- Fundació Institut Universitari per a la recerca a l'Atenció Primària de Salut Jordi Gol i Gurina (IDIAPJGol), Barcelona, Spain
- Walid Fakhouri
- Eli Lilly and Company, Windlesham, Surrey, United Kingdom
- Meghna Jani
- Centre for Epidemiology Versus Arthritis, University of Manchester, Manchester, United Kingdom
- Raivo Kolde
- Institute of Computer Science, University of Tartu, Tartu, Estonia
- Jan A Kors
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Lembe Kullamaa
- Department of Epidemiology and Biostatistics, National Institute for Health Development, Tallinn, Estonia; Institute of Family Medicine and Public Health, University of Tartu, Tartu, Estonia; European Patients' Forum, Brussels, Belgium
- Jennifer Lane
- Nuffield Department of Orthopaedics Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, United Kingdom
- Albert Prats-Uribe
- Nuffield Department of Orthopaedics Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, United Kingdom
- Sulev Reisberg
- Institute of Computer Science, University of Tartu, Tartu, Estonia; STACC, Tartu, Estonia; Quretec, Tartu, Estonia
- Anthony G Sena
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands; Janssen Research and Development, Titusville, NJ, United States
- Katia Verhamme
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- James Weaver
- Janssen Research and Development, Titusville, NJ, United States; Observational Health Data Sciences and Informatics, New York, NY, United States
- Patrick Ryan
- Janssen Research and Development, Titusville, NJ, United States; Observational Health Data Sciences and Informatics, New York, NY, United States
- Daniel Prieto-Alhambra
- Nuffield Department of Orthopaedics Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, United Kingdom
- Peter R Rijnbeek
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
33
Kiser AC, Eilbeck K, Ferraro JP, Skarda DE, Samore MH, Bucher B. Standard Vocabularies to Improve Machine Learning Model Transferability With Electronic Health Record Data: Retrospective Cohort Study Using Health Care-Associated Infection. JMIR Med Inform 2022; 10:e39057. [PMID: 36040784 PMCID: PMC9472055 DOI: 10.2196/39057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2022] [Revised: 08/09/2022] [Accepted: 08/15/2022] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND With the widespread adoption of electronic health records (EHRs) by US hospitals, there is an opportunity to leverage these data to develop predictive algorithms that improve clinical care. A key barrier to model development and implementation is external validation of model discrimination, which is rarely performed and often reveals worse performance. One reason machine learning models do not generalize externally is data heterogeneity. A potential solution to the substantial data heterogeneity between health care systems is to map EHR data elements to standard vocabularies. The advantage of these vocabularies is the hierarchical relationship between elements, which allows specific clinical features to be aggregated into more general grouped concepts. OBJECTIVE This study aimed to evaluate whether grouping EHR data using standard vocabularies improves the transferability of machine learning models for the detection of postoperative health care-associated infections across institutions with different EHR systems. METHODS Patients who underwent surgery at University of Utah Health and Intermountain Healthcare from July 2014 to August 2017 with complete follow-up data were included. The primary outcome was a health care-associated infection within 30 days of the procedure. EHR data from 0-30 days after the operation were mapped to standard vocabularies and grouped using the hierarchical relationships of the vocabularies. Model performance was measured using the area under the receiver operating characteristic curve (AUC) and F1-score in internal and external validations. To evaluate model transferability, a difference-in-difference metric was defined as the difference in performance drop between internal and external validations for the baseline and grouped models. RESULTS A total of 5775 patients from the University of Utah and 15,434 patients from Intermountain Healthcare were included.
The prevalence of the selected outcomes ranged from 4.9% (761/15,434) to 5% (291/5775) for surgical site infections, from 0.8% (44/5775) to 1.1% (171/15,434) for pneumonia, from 2.6% (400/15,434) to 3% (175/5775) for sepsis, and from 0.8% (125/15,434) to 0.9% (50/5775) for urinary tract infections. For all outcomes, grouping the data using standard vocabularies resulted in a smaller drop in AUC and F1-score in external validation compared with the baseline features (all P<.001, except urinary tract infection AUC: P=.002). The difference-in-difference metrics ranged from 0.005 to 0.248 for AUC and from 0.075 to 0.216 for F1-score. CONCLUSIONS We demonstrated that grouping machine learning model features based on standard vocabularies improved model transferability between data sets across 2 institutions. Improving model transferability using standard vocabularies has the potential to improve the generalization of clinical prediction models across the health care system.
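The hierarchical grouping described above rolls specific codes up to shared ancestor concepts, so site-specific coding habits land on one common feature. A toy sketch with invented labels standing in for concept codes (a real implementation would traverse SNOMED CT hierarchies or the OMOP CONCEPT_ANCESTOR table):

```python
# Toy hierarchy: each code maps to its parent. The labels are invented
# for illustration; real systems use coded concept identifiers.
parent = {
    "MRSA pneumonia": "Bacterial pneumonia",
    "Viral pneumonia": "Pneumonia",
    "Bacterial pneumonia": "Pneumonia",
    "Pneumonia": "Lower respiratory infection",
}

def roll_up(code, levels):
    """Replace a specific code by its ancestor `levels` steps up the
    hierarchy, so features from different EHRs become comparable."""
    for _ in range(levels):
        code = parent.get(code, code)  # stop at the root if reached
    return code

print(roll_up("MRSA pneumonia", 2))  # → 'Pneumonia'
```

The choice of how many levels to roll up trades specificity against transferability, which is exactly the effect the difference-in-difference metric above quantifies.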
Affiliation(s)
- Amber C Kiser
- Department of Biomedical Informatics, School of Medicine, University of Utah, Salt Lake City, UT, United States
- Karen Eilbeck
- Department of Biomedical Informatics, School of Medicine, University of Utah, Salt Lake City, UT, United States
- Jeffrey P Ferraro
- Department of Medicine, School of Medicine, University of Utah, Salt Lake City, UT, United States
- David E Skarda
- Center for Value-Based Surgery, Intermountain Healthcare, Salt Lake City, UT, United States; Department of Surgery, School of Medicine, University of Utah, Salt Lake City, UT, United States
- Matthew H Samore
- Department of Medicine, School of Medicine, University of Utah, Salt Lake City, UT, United States; Informatics, Decision-Enhancement and Analytic Sciences Center 2.0, Veterans Affairs Salt Lake City Health Care System, Salt Lake City, UT, United States
- Brian Bucher
- Department of Biomedical Informatics, School of Medicine, University of Utah, Salt Lake City, UT, United States; Department of Surgery, School of Medicine, University of Utah, Salt Lake City, UT, United States
34
Li Y, Dong W, Ru B, Black A, Zhang X, Guan Y. Generic Medical Concept Embedding and Time Decay for Diverse Patient Outcome Prediction Tasks. iScience 2022; 25:104880. [PMID: 36039302 PMCID: PMC9418804 DOI: 10.1016/j.isci.2022.104880] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2022] [Revised: 05/10/2022] [Accepted: 08/02/2022] [Indexed: 11/18/2022] Open
Abstract
Many fields, including Natural Language Processing (NLP), have recently witnessed the benefit of pre-training with large generic datasets to improve the accuracy of prediction tasks. However, there exist key differences between the longitudinal healthcare data (e.g., claims) and NLP tasks, which make the direct application of NLP pre-training methods to healthcare data inappropriate. In this article, we developed a pre-training scheme for longitudinal healthcare data that leverages the pairing of medical history and a future event. We then conducted systematic evaluations of various methods on ten patient-level prediction tasks encompassing adverse events, misdiagnosis, disease risks, and readmission. In addition to substantially reducing model size, our results show that a universal medical concept embedding pretrained with generic big data as well as carefully designed time decay modeling improves the accuracy of different downstream prediction tasks.
Highlights:
- This work develops a pre-training scheme for longitudinal healthcare data
- The method leverages the pairing of medical history and a future event
- We created a universal medical concept embedding pretrained with generic data
- We designed a time-decay method for medical concept data
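The time-decay idea above can be sketched as an exponentially weighted sum of concept vectors. The half-life scheme and the tiny 2-dimensional embeddings below are invented for illustration and are not the paper's exact formulation:

```python
import math

def time_decay_embedding(history, concept_vectors, half_life_days):
    """Aggregate a patient's concept vectors into one patient vector,
    exponentially down-weighting events that occurred further before
    the index date (illustrative scheme)."""
    lam = math.log(2) / half_life_days
    dim = len(next(iter(concept_vectors.values())))
    agg = [0.0] * dim
    for concept, days_before_index in history:
        weight = math.exp(-lam * days_before_index)
        for i, x in enumerate(concept_vectors[concept]):
            agg[i] += weight * x
    return agg

# hypothetical embeddings; with a 30-day half-life, the 30-day-old
# event contributes half its vector
vectors = {"diabetes": [1.0, 0.0], "fracture": [0.0, 1.0]}
history = [("diabetes", 0), ("fracture", 30)]
print(time_decay_embedding(history, vectors, half_life_days=30))
```

The decay rate (here expressed as a half-life) is a modeling choice; the paper's point is that such recency weighting, combined with pretrained concept embeddings, improves downstream prediction.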
35
Yoo J, Lee J, Min JY, Choi SW, Kwon JM, Cho I, Lim C, Choi MY, Cha WC. Development of an Interoperable and Easily Transferable Clinical Decision Support System Deployment Platform: System Design and Development Study. J Med Internet Res 2022; 24:e37928. [PMID: 35896020 PMCID: PMC9377482 DOI: 10.2196/37928] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2022] [Revised: 06/18/2022] [Accepted: 07/10/2022] [Indexed: 11/20/2022] Open
Abstract
BACKGROUND A clinical decision support system (CDSS) is recognized as a technology that enhances clinical efficacy and safety. However, its full potential has not been realized, mainly because of issues with clinical data standards and noninteroperable platforms. OBJECTIVE In this paper, we introduce the common data model-based intelligent algorithm network environment (CANE) platform, which supports the implementation and deployment of CDSSs. METHODS CDSS reasoning engines, usually represented as R or Python objects, are deployed into the CANE platform and converted into C# objects. When a clinician requests CANE-based decision support in the electronic health record (EHR) system, the patient's information is transformed into Health Level 7 Fast Healthcare Interoperability Resources (FHIR) format and transmitted to the CANE server inside the hospital firewall. Upon receiving the necessary data, the CANE system's modules perform the following tasks: (1) the preprocessing module converts the FHIR resources into the input data required by the specific reasoning engine, (2) the reasoning engine module runs the target algorithms, (3) the integration module communicates with other institutions' CANE systems to request and transmit a summary report to aid decision support, and (4) a user interface is created by integrating the summary report with the results calculated by the reasoning engine. RESULTS We developed the CANE system such that any algorithm implemented in it can be called directly through a RESTful application programming interface when integrated with an EHR system. Eight algorithms were developed and deployed in the CANE system. Using a knowledge-based algorithm, physicians can screen patients who are prone to sepsis and obtain treatment guidance for them with the CANE system.
Further, using a non-knowledge-based algorithm, the CANE system supports emergency physicians' clinical decisions about optimal resource allocation by predicting a patient's acuity and prognosis during triage. CONCLUSIONS We successfully developed a common data model-based platform that adheres to medical informatics standards and can aid the deployment of artificial intelligence models written in R or Python.
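Step (1) above flattens FHIR resources into reasoning-engine inputs. A minimal sketch with a hand-written FHIR R4 Observation resource (illustrative only; the CANE preprocessing module itself is not public):

```python
# A hand-written example FHIR R4 Observation (heart rate, LOINC 8867-4);
# real resources carry many more fields (status, subject, effective time).
observation = {
    "resourceType": "Observation",
    "code": {"coding": [{"system": "http://loinc.org", "code": "8867-4",
                         "display": "Heart rate"}]},
    "valueQuantity": {"value": 118, "unit": "beats/minute"},
}

def fhir_to_feature(obs):
    """Flatten one Observation into the kind of key-value feature a
    reasoning engine consumes: the LOINC code plus the numeric value."""
    coding = obs["code"]["coding"][0]
    return {"loinc": coding["code"], "value": obs["valueQuantity"]["value"]}

print(fhir_to_feature(observation))  # → {'loinc': '8867-4', 'value': 118}
```

Keeping the extraction keyed on standard codes (LOINC here) rather than local field names is what makes the same reasoning engine deployable against different EHR systems.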
Affiliation(s)
- Junsang Yoo
- Department of Digital Health, Samsung Advanced Institute for Health Science & Technology, Sungkyunkwan University, Seoul, Republic of Korea
- Sae Won Choi
- Office of Hospital Information, Seoul National University Hospital, Seoul, Republic of Korea
- Insook Cho
- Nursing Department, School of Medicine, Inha University, Incheon, Republic of Korea
- Chiyeon Lim
- Department of Biostatistics, Dongguk University School of Medicine, Goyang, Republic of Korea
- Mi Young Choi
- Data Service Center, en-core Co, Ltd, Seoul, Republic of Korea
- Won Chul Cha
- Department of Digital Health, Samsung Advanced Institute for Health Science & Technology, Sungkyunkwan University, Seoul, Republic of Korea
- Department of Emergency Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Digital Innovation Center, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
36
Mokgokong R, Schnabel R, Witt H, Miller R, Lee TC. Performance of an electronic health record-based predictive model to identify patients with atrial fibrillation across countries. PLoS One 2022; 17:e0269867. [PMID: 35802569 PMCID: PMC9269467 DOI: 10.1371/journal.pone.0269867] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2021] [Accepted: 05/27/2022] [Indexed: 11/30/2022] Open
Abstract
Background The burden of atrial fibrillation (AF) on patients and healthcare systems warrants innovative strategies for screening asymptomatic individuals. Objective We sought to externally validate a predictive model, originally developed in a German population, for detecting unidentified incident AF, utilising real-world primary healthcare databases from countries in Europe and Australia. Methods This retrospective cohort study used anonymized, longitudinal patient data from 5 country-level primary care databases from Australia, Belgium, France, Germany, and the UK. Eligible patients were adults (≥45 years) with either an AF diagnosis (cases) or no diagnosis (controls) who had continuous enrolment in the respective database prior to the study period. Logistic regression was fitted to a binary response (yes/no) for AF diagnosis using pre-determined risk factors. Results AF patients were from Germany (n = 63,562), the UK (n = 42,652), France (n = 7,213), Australia (n = 2,753), and Belgium (n = 1,371). Cases were more likely than controls to have hypertension or other cardiac conditions in all validation datasets, as in the model development data. The area under the receiver operating characteristic (ROC) curve in the validation datasets ranged from 0.79 (Belgium) to 0.84 (Germany), comparable to the German development model, which had an area under the curve of 0.83. Most validation sets showed similar specificity at approximately 80% sensitivity, ranging from 67% (France) to 71% (United Kingdom). The positive predictive value (PPV) ranged from 2% (Belgium) to 16% (Germany), and the number needed to screen ranged from 50 in Belgium to 6 in Germany. The prevalence of AF varied widely between these datasets, which may be related to different coding practices. Low prevalence affected the PPV, but not sensitivity, specificity, or the ROC curves.
Conclusions AF risk prediction algorithms offer targeted ways to identify patients using electronic health records, which could improve screening number and the cost-effectiveness of AF screening if implemented in clinical practice.
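The screening metrics reported above (sensitivity, specificity, PPV, and the number needed to be screened, which is 1/PPV) all follow from the confusion matrix at a chosen probability threshold. A minimal sketch of that calculation, not the authors' code:

```python
def screening_metrics(y_true, y_prob, threshold):
    """Confusion-matrix metrics for a screening model at a fixed threshold."""
    tp = fp = fn = tn = 0
    for y, p in zip(y_true, y_prob):
        flagged = p >= threshold  # model flags the patient for screening
        if flagged and y:
            tp += 1
        elif flagged:
            fp += 1
        elif y:
            fn += 1
        else:
            tn += 1
    ppv = tp / (tp + fp)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": ppv,
        # patients flagged per true case found, as in the Belgium/Germany comparison
        "number_needed_to_screen": 1 / ppv,
    }
```

Because PPV (and hence the number needed to be screened) depends on prevalence while sensitivity and specificity do not, the wide PPV range across countries is consistent with the abstract's remark about low prevalence.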
Affiliation(s)
- Ruth Mokgokong
- Health Economics & Outcomes Research, Internal Medicine, Pfizer, Surrey, United Kingdom
- Renate Schnabel
- University Heart & Vascular Center Hamburg Eppendorf, Hamburg, Germany
- German Center for Cardiovascular Research (DZHK) Partner Site, Hamburg/Kiel/Lübeck, Germany
- Theodore C. Lee
- Internal Medicine, Pfizer, New York City, New York, United States of America
37
John LH, Kors JA, Reps JM, Ryan PB, Rijnbeek PR. Logistic regression models for patient-level prediction based on massive observational data: Do we need all data? Int J Med Inform 2022; 163:104762. [DOI: 10.1016/j.ijmedinf.2022.104762] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2020] [Revised: 04/07/2022] [Accepted: 04/08/2022] [Indexed: 01/16/2023]
38
van Os HJA, Kanning JP, Wermer MJH, Chavannes NH, Numans ME, Ruigrok YM, van Zwet EW, Putter H, Steyerberg EW, Groenwold RHH. Developing Clinical Prediction Models Using Primary Care Electronic Health Record Data: The Impact of Data Preparation Choices on Model Performance. Front Epidemiol 2022; 2:871630. [PMID: 38455328 PMCID: PMC10910909 DOI: 10.3389/fepid.2022.871630] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 02/08/2022] [Accepted: 04/11/2022] [Indexed: 03/09/2024]
Abstract
Objective To quantify prediction model performance in relation to data preparation choices when using electronic health records (EHR). Study Design and Setting Cox proportional hazards models were developed for predicting the first-ever main adverse cardiovascular events using Dutch primary care EHR data. The reference model was based on a 1-year run-in period, cardiovascular events were defined based on both EHR diagnosis and medication codes, and missing values were multiply imputed. We compared data preparation choices based on (i) length of the run-in period (2- or 3-year run-in); (ii) outcome definition (EHR diagnosis codes or medication codes only); and (iii) methods addressing missing values (mean imputation or complete case analysis) by making variations on the derivation set and testing their impact in a validation set. Results We included 89,491 patients in whom 6,736 first-ever main adverse cardiovascular events occurred during a median follow-up of 8 years. Outcome definition based only on diagnosis codes led to a systematic underestimation of risk (calibration curve intercept: 0.84; 95% CI: 0.83-0.84), while complete case analysis led to overestimation (calibration curve intercept: -0.52; 95% CI: -0.53 to -0.51). Differences in the length of the run-in period showed no relevant impact on calibration and discrimination. Conclusion Data preparation choices regarding outcome definition or methods to address missing values can have a substantial impact on the calibration of predictions, hampering reliable clinical decision support. This study further illustrates the urgency of transparent reporting of modeling choices in an EHR data setting.
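The calibration-curve intercept quoted in these results (positive when risks are systematically underestimated, negative when overestimated) is commonly estimated as calibration-in-the-large: refit only an intercept with the logit of the predicted risk as an offset. A minimal Newton-method sketch, assuming predicted risks strictly between 0 and 1 (not the study's implementation):

```python
import math

def calibration_intercept(y, p, iters=50):
    """Calibration-in-the-large: the intercept a in y ~ sigmoid(a + logit(p)).
    a > 0 means predicted risks are too low; a < 0 means too high."""
    offsets = [math.log(pi / (1 - pi)) for pi in p]  # logit of each predicted risk
    a = 0.0
    for _ in range(iters):  # Newton-Raphson on the one-parameter log-likelihood
        mus = [1 / (1 + math.exp(-(a + o))) for o in offsets]
        score = sum(yi - mu for yi, mu in zip(y, mus))
        info = sum(mu * (1 - mu) for mu in mus)
        a += score / info
    return a
```

On this convention, the abstract's intercept of 0.84 for the diagnosis-codes-only outcome definition indicates underestimated risks, and -0.52 for complete case analysis indicates overestimated risks.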
Affiliation(s)
- Hendrikus J. A. van Os
- Department of Neurology, Leiden University Medical Hospital, Leiden, Netherlands
- National eHealth Living Lab, Leiden University Medical Hospital, Leiden, Netherlands
- Department of Public Health & Primary Care, Leiden University Medical Hospital, Leiden, Netherlands
- Jos P. Kanning
- Department of Neurology, University Medical Center Utrecht, Utrecht, Netherlands
- Marieke J. H. Wermer
- Department of Neurology, Leiden University Medical Hospital, Leiden, Netherlands
- Niels H. Chavannes
- National eHealth Living Lab, Leiden University Medical Hospital, Leiden, Netherlands
- Department of Public Health & Primary Care, Leiden University Medical Hospital, Leiden, Netherlands
- Mattijs E. Numans
- Department of Public Health & Primary Care, Leiden University Medical Hospital, Leiden, Netherlands
- Ynte M. Ruigrok
- Department of Neurology, University Medical Center Utrecht, Utrecht, Netherlands
- Erik W. van Zwet
- Department of Biomedical Data Sciences, Leiden University Medical Hospital, Leiden, Netherlands
- Hein Putter
- Department of Biomedical Data Sciences, Leiden University Medical Hospital, Leiden, Netherlands
- Ewout W. Steyerberg
- Department of Biomedical Data Sciences, Leiden University Medical Hospital, Leiden, Netherlands
- Rolf H. H. Groenwold
- Department of Biomedical Data Sciences, Leiden University Medical Hospital, Leiden, Netherlands
- Department of Clinical Epidemiology, Leiden University Medical Hospital, Leiden, Netherlands
39
Reps JM, Williams RD, Schuemie MJ, Ryan PB, Rijnbeek PR. Learning patient-level prediction models across multiple healthcare databases: evaluation of ensembles for increasing model transportability. BMC Med Inform Decis Mak 2022; 22:142. [PMID: 35614485 PMCID: PMC9134686 DOI: 10.1186/s12911-022-01879-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/30/2021] [Accepted: 05/13/2022] [Indexed: 11/29/2022] Open
Abstract
Background Accurate prognostic models could aid medical decision making. Large observational databases often contain temporal medical data for large and diverse populations of patients, and it may be possible to learn prognostic models from such data. However, the performance of a prognostic model often worsens undesirably when it is transported to a different database (or into a clinical setting). In this study we investigated ensemble approaches that combine prognostic models independently developed using different databases (a simple federated learning approach) to determine whether such cross-database ensembles can improve model transportability, that is, perform better in new data than single-database models. Methods For a given prediction question we independently trained five single-database models, each using a different observational healthcare database. We then developed and investigated numerous ensemble models (fusion, stacking, and mixture of experts) that combined the different database models. The performance of each model was investigated via discrimination and calibration using a leave-one-dataset-out technique, i.e., holding out one database for validation and using the remaining four datasets for model development. The internal validation performance of a model developed using the held-out database was calculated and presented as the 'internal benchmark' for comparison. Results In this study the fusion ensembles generally outperformed the single-database models when transported to a previously unseen database, and their performance was more consistent across unseen databases. Stacking ensembles performed poorly in terms of discrimination when the labels in the unseen database were limited. Calibration was consistently poor when both ensembles and single-database models were applied to previously unseen databases. 
Conclusion A simple federated learning approach that uses ensemble techniques to combine models independently developed across different databases for the same prediction question may improve discriminative performance in new data (a new database or clinical setting), but such models will need to be recalibrated using the new data. This could help medical decision making by improving prognostic model performance.
Supplementary Information The online version contains supplementary material available at 10.1186/s12911-022-01879-6.
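The fusion ensemble and leave-one-dataset-out evaluation described here can be sketched as below; the `train` and `evaluate` hooks are hypothetical interfaces for illustration, not the OHDSI PatientLevelPrediction API:

```python
def fusion_predict(models, X):
    """Uniform fusion: average the risks predicted by each single-database model."""
    preds = [m(X) for m in models]
    return [sum(ps) / len(ps) for ps in zip(*preds)]

def leave_one_database_out(databases, train, evaluate):
    """For each held-out database, fuse models trained on all the others,
    then score the fused predictor on the held-out data."""
    results = {}
    for held_out in databases:
        models = [train(db) for db in databases if db is not held_out]
        results[held_out["name"]] = evaluate(lambda X: fusion_predict(models, X), held_out)
    return results
```

As the abstract notes, a fused predictor of this kind may discriminate better in an unseen database than any single-database model, but it still needs recalibration on the new data.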
Affiliation(s)
- Ross D Williams
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Peter R Rijnbeek
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands

40
Lin V, Tsouchnika A, Allakhverdiiev E, Rosen AW, Gögenur M, Clausen JSR, Bräuner KB, Walbech JS, Rijnbeek P, Drakos I, Gögenur I. Training prediction models for individual risk assessment of postoperative complications after surgery for colorectal cancer. Tech Coloproctol 2022; 26:665-675. [PMID: 35593971 DOI: 10.1007/s10151-022-02624-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/13/2022] [Accepted: 04/20/2022] [Indexed: 12/01/2022]
Abstract
BACKGROUND The occurrence of postoperative complications and anastomotic leakage are major drivers of mortality in the immediate phase after colorectal cancer surgery. We trained prediction models for calculating patients' individual risk of complications based only on data available preoperatively in a multidisciplinary team setting. Knowing the probability of developing a complication prior to surgery could improve informed decision-making by surgeon and patient and individualize surgical treatment trajectories. METHODS All patients over 18 years of age undergoing any resection for colorectal cancer between January 1, 2014 and December 31, 2019 from the nationwide Danish Colorectal Cancer Group database were included. Data from the database were converted into the Observational Medical Outcomes Partnership Common Data Model, maintained by the Observational Health Data Sciences and Informatics (OHDSI) initiative. Multiple machine learning models were trained to predict postoperative complications of Clavien-Dindo grade ≥ 3B and anastomotic leakage within 30 days after surgery. RESULTS Between 2014 and 2019, 23,907 patients underwent resection for colorectal cancer in Denmark. A Clavien-Dindo complication of grade ≥ 3B occurred in 2,958 patients (12.4%). Of the 17,190 patients who received an anastomosis, 929 (5.4%) experienced anastomotic leakage. Among the compared machine learning models, Lasso logistic regression performed best. The predictive model had an area under the receiver operating characteristic curve (AUROC) of 0.704 (95% CI 0.683-0.724) for complications and 0.690 (95% CI 0.655-0.724) for anastomotic leakage. CONCLUSIONS Predicting postoperative complications from preoperative variables alone using a national quality assurance colorectal cancer database shows promise for calculating patients' individual risk. Future work will focus on assessing the value of adding laboratory parameters and drug exposure as candidate predictors, and on assessing the external validity of the proposed model.
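Lasso (L1-penalized) logistic regression, the best-performing algorithm here and in several neighbouring entries, can be sketched with proximal gradient descent; the soft-thresholding step is what drives uninformative coefficients exactly to zero. A minimal didactic version without intercept or feature scaling, not the study's implementation:

```python
import math

def lasso_logistic(X, y, lam=0.1, lr=0.1, iters=2000):
    """L1-penalized logistic regression fitted by proximal gradient descent.
    X is a list of feature rows, y a list of 0/1 labels."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        # gradient of the mean logistic log-loss
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi))
            p = 1 / (1 + math.exp(-z))
            for j, xj in enumerate(xi):
                grad[j] += (p - yi) * xj / n
        # gradient step, then soft-thresholding (the L1 proximal map):
        # coefficients smaller than the penalty are set exactly to zero
        for j in range(d):
            wj = w[j] - lr * grad[j]
            w[j] = math.copysign(max(abs(wj) - lr * lam, 0.0), wj)
    return w
```

This built-in variable selection is why Lasso is a common choice when candidate predictors vastly outnumber what a clinically usable model should carry.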
Affiliation(s)
- V Lin
- Center for Surgical Science, Department of Surgery, Zealand University Hospital Køge, Lykkebækvej 1, 4600, Køge, Denmark
- A Tsouchnika
- Center for Surgical Science, Department of Surgery, Zealand University Hospital Køge, Lykkebækvej 1, 4600, Køge, Denmark
- E Allakhverdiiev
- Center for Surgical Science, Department of Surgery, Zealand University Hospital Køge, Lykkebækvej 1, 4600, Køge, Denmark
- A W Rosen
- Center for Surgical Science, Department of Surgery, Zealand University Hospital Køge, Lykkebækvej 1, 4600, Køge, Denmark
- M Gögenur
- Center for Surgical Science, Department of Surgery, Zealand University Hospital Køge, Lykkebækvej 1, 4600, Køge, Denmark
- J S R Clausen
- Center for Surgical Science, Department of Surgery, Zealand University Hospital Køge, Lykkebækvej 1, 4600, Køge, Denmark
- K B Bräuner
- Center for Surgical Science, Department of Surgery, Zealand University Hospital Køge, Lykkebækvej 1, 4600, Køge, Denmark
- J S Walbech
- Center for Surgical Science, Department of Surgery, Zealand University Hospital Køge, Lykkebækvej 1, 4600, Køge, Denmark
- P Rijnbeek
- Department of Medical Informatics, Erasmus University Medical Centre, Rotterdam, The Netherlands
- I Drakos
- Center for Surgical Science, Department of Surgery, Zealand University Hospital Køge, Lykkebækvej 1, 4600, Køge, Denmark
- I Gögenur
- Center for Surgical Science, Department of Surgery, Zealand University Hospital Køge, Lykkebækvej 1, 4600, Køge, Denmark

41
Mapping Cancer Registry Data to the Episode Domain of the Observational Medical Outcomes Partnership Model (OMOP). Appl Sci (Basel) 2022. [DOI: 10.3390/app12084010] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/05/2023]
Abstract
A great challenge in the use of standardized cancer registry data is deriving reliable, evidence-based results from large amounts of data. A solution could be mapping the data to a common data model such as OMOP, which represents knowledge in a unified semantic base and enables decentralized analysis. The recently released Episode Domain of the OMOP CDM allows episodic modelling of a patient's disease and treatment phases. In this study, we mapped oncology registry data to the Episode Domain. A total of 184,718 episodes could be mapped to concepts of cancer drug treatment. Additionally, source data were mapped to new terminologies as part of the release. It was possible to map ≈73.8% of the source data to the respective OMOP standard; the best mapping was achieved in the Procedure Domain with 98.7%. To evaluate the implementation, the survival probabilities of the CDM and source system were calculated (n = 2756/2902, median OAS = 82.2/91.1 months, 95% CI = 77.4-89.5/84.4-100.9). In conclusion, the new release of the CDM increased its applicability, especially in observational cancer research. Regarding the mapping, a higher score could be achieved if terminologies that are frequently used in Europe were included in the Standardized Vocabulary Metadata Repository.
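The evaluation above compares median overall survival between the source system and the CDM, which rests on the Kaplan-Meier estimator. A minimal sketch of the median-survival calculation under right censoring (a generic illustration, not the study's pipeline):

```python
def km_median_survival(times, events):
    """Kaplan-Meier median survival time.
    `times` are follow-up times; `events` are 1 for death, 0 for censoring.
    Returns the first time the survival curve drops to 50% or below."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    for t, observed in data:
        if observed:
            surv *= 1 - 1 / at_risk  # the curve steps down only at observed events
            if surv <= 0.5:
                return t
        at_risk -= 1  # censored patients leave the risk set without a step
    return None  # curve never reaches 50%: median not reached
```

Running this on both cohorts after an ETL is a simple sanity check that the mapping preserved outcome information.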
42
Künnapuu K, Ioannou S, Ligi K, Kolde R, Laur S, Vilo J, Rijnbeek PR, Reisberg S. Trajectories: a framework for detecting temporal clinical event sequences from health data standardized to the Observational Medical Outcomes Partnership (OMOP) Common Data Model. JAMIA Open 2022; 5:ooac021. [PMID: 35571357 PMCID: PMC9097714 DOI: 10.1093/jamiaopen/ooac021] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2021] [Revised: 02/16/2022] [Accepted: 03/05/2022] [Indexed: 11/14/2022] Open
Abstract
Objective To develop a framework for identifying temporal clinical event trajectories from Observational Medical Outcomes Partnership-formatted observational healthcare data. Materials and Methods A 4-step framework based on significant temporal event pair detection is described and implemented as an open-source R package. It is used on a population-based Estonian dataset to first replicate a large Danish population-based study and second, to conduct a disease trajectory detection study for type 2 diabetes patients in the Estonian and Dutch databases as an example. Results As a proof of concept, we apply the methods in the Estonian database and provide a detailed breakdown of our findings. All Estonian population-based event pairs are shown. We compare the event pairs identified from Estonia to Danish and Dutch data and discuss the causes of the differences. The overlap in the results was only 2.4%, which highlights the need for running similar studies in different populations. Conclusions For the first time, there is a complete software package for detecting disease trajectories in health data.
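Temporal event-pair detection of this kind can be illustrated by recording, per patient, which of two events occurs first and testing the direction with an exact sign test; a simplified sketch, not the Trajectories package itself:

```python
from itertools import combinations
from math import comb

def directed_pair_counts(patients):
    """For each event pair (A, B), count patients in whom A's first
    occurrence precedes B's versus the reverse.
    Each patient is a list of (day, event) tuples."""
    counts = {}
    for events in patients:
        first = {}
        for day, ev in sorted(events):
            first.setdefault(ev, day)  # keep only the first occurrence
        for a, b in combinations(sorted(first), 2):
            fwd, rev = counts.get((a, b), (0, 0))
            if first[a] < first[b]:
                fwd += 1
            elif first[b] < first[a]:
                rev += 1
            counts[(a, b)] = (fwd, rev)
    return counts

def binom_two_sided(k, n, p=0.5):
    """Exact two-sided binomial p-value: a sign test for pair direction."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    return sum(x for x in pmf if x <= pmf[k] + 1e-12)
```

A pair would be kept as a candidate trajectory step only when its direction is significantly one-sided (after multiple-testing correction, which this sketch omits).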
Affiliation(s)
- Solomon Ioannou
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, the Netherlands
- Kadri Ligi
- STACC, Tartu, Estonia
- Institute of Computer Science, University of Tartu, Tartu, Estonia
- Raivo Kolde
- Institute of Computer Science, University of Tartu, Tartu, Estonia
- Sven Laur
- STACC, Tartu, Estonia
- Institute of Computer Science, University of Tartu, Tartu, Estonia
- Jaak Vilo
- STACC, Tartu, Estonia
- Institute of Computer Science, University of Tartu, Tartu, Estonia
- Quretec, Tartu, Estonia
- Peter R Rijnbeek
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, the Netherlands
- Sulev Reisberg
- STACC, Tartu, Estonia
- Institute of Computer Science, University of Tartu, Tartu, Estonia
- Quretec, Tartu, Estonia

43
Jung H, Yoo S, Kim S, Heo E, Kim B, Lee HY, Hwang H. Patient-Level Fall Risk Prediction Using the Observational Medical Outcomes Partnership's Common Data Model: Pilot Feasibility Study. JMIR Med Inform 2022; 10:e35104. [PMID: 35275076 PMCID: PMC8957002 DOI: 10.2196/35104] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2021] [Revised: 01/02/2022] [Accepted: 01/31/2022] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Falls in acute care settings threaten patients' safety. Researchers have been developing fall risk prediction models and exploring risk factors to provide evidence-based fall prevention practices; however, such efforts are hindered by insufficient samples, limited covariates, and a lack of standardized methodologies that aid study replication. OBJECTIVE The objectives of this study were to (1) convert fall-related electronic health record data into the standardized Observational Medical Outcomes Partnership (OMOP) common data model format and (2) develop models that predict fall risk during 2 time periods. METHODS As a pilot feasibility test, we converted fall-related electronic health record data (nursing notes, fall risk assessment sheet, patient acuity assessment sheet, and clinical observation sheet) into the standardized OMOP common data model format using an extract-transform-load (ETL) process. We developed fall risk prediction models for 2 time periods (within 7 days of admission and during the entire hospital stay) using 2 algorithms (least absolute shrinkage and selection operator logistic regression and random forest). RESULTS In total, 6277 nursing statements, 747,049,486 clinical observation sheet records, 1,554,775 fall risk scores, and 5,685,011 patient acuity scores were converted into the OMOP common data model format. All our models (area under the receiver operating characteristic curve 0.692-0.726) performed better than the Hendrich II Fall Risk Model. Patient acuity score, fall history, age ≥60 years, movement disorder, and central nervous system agents were the most important predictors in the logistic regression models. CONCLUSIONS To enhance model performance further, we are currently converting all nursing records into the OMOP common data model format, which will then be included in the models. 
Thus, in the near future, the performance of fall risk prediction models could be improved through the application of abundant nursing records and external validation.
Affiliation(s)
- Hyesil Jung
- Office of eHealth Research and Business, Seoul National University Bundang Hospital, Seongnam-si, Republic of Korea
- Sooyoung Yoo
- Office of eHealth Research and Business, Seoul National University Bundang Hospital, Seongnam-si, Republic of Korea
- Seok Kim
- Office of eHealth Research and Business, Seoul National University Bundang Hospital, Seongnam-si, Republic of Korea
- Eunjeong Heo
- Office of eHealth Research and Business, Seoul National University Bundang Hospital, Seongnam-si, Republic of Korea
- Borham Kim
- Office of eHealth Research and Business, Seoul National University Bundang Hospital, Seongnam-si, Republic of Korea
- Ho-Young Lee
- Office of eHealth Research and Business, Seoul National University Bundang Hospital, Seongnam-si, Republic of Korea
- Hee Hwang
- Kakao Healthcare Company-In-Company, Seongnam-si, Republic of Korea

44
Guo LL, Pfohl SR, Fries J, Johnson AEW, Posada J, Aftandilian C, Shah N, Sung L. Evaluation of domain generalization and adaptation on improving model robustness to temporal dataset shift in clinical medicine. Sci Rep 2022; 12:2726. [PMID: 35177653 PMCID: PMC8854561 DOI: 10.1038/s41598-022-06484-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/26/2021] [Accepted: 01/31/2022] [Indexed: 11/24/2022] Open
Abstract
Temporal dataset shift associated with changes in healthcare over time is a barrier to deploying machine learning-based clinical decision support systems. Algorithms that learn robust models by estimating invariant properties across time periods for domain generalization (DG) and unsupervised domain adaptation (UDA) might be suitable to proactively mitigate dataset shift. The objective was to characterize the impact of temporal dataset shift on clinical prediction models and to benchmark DG and UDA algorithms on improving model robustness. In this cohort study, intensive care unit patients from the MIMIC-IV database were categorized by year groups (2008-2010, 2011-2013, 2014-2016, and 2017-2019). Tasks were predicting mortality, long length of stay, sepsis, and invasive ventilation. Feedforward neural networks were used as prediction models. The baseline experiment trained models using empirical risk minimization (ERM) on 2008-2010 (ERM[08-10]) and evaluated them on subsequent year groups. The DG experiment trained models using algorithms that estimated invariant properties across 2008-2016 and evaluated them on 2017-2019. The UDA experiment additionally leveraged unlabelled samples from 2017-2019 for unsupervised distribution matching. DG and UDA models were compared to ERM[08-16] models trained on 2008-2016. Main performance measures were the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve, and absolute calibration error. Threshold-based metrics, including false positives and false negatives, were used to assess the clinical impact of temporal dataset shift and its mitigation strategies. In the baseline experiments, dataset shift was most evident for sepsis prediction (maximum AUROC drop, 0.090; 95% confidence interval (CI), 0.080-0.101). 
Considering a scenario of 100 consecutively admitted patients showed that ERM[08-10] applied to 2017-2019 was associated with one additional false negative among 11 patients with sepsis, compared to the model applied to 2008-2010. Compared with ERM[08-16], the DG and UDA experiments failed to produce more robust models (range of AUROC difference, -0.003 to 0.050). In conclusion, DG and UDA failed to produce more robust models compared to ERM in the setting of temporal dataset shift. Alternative approaches are required to preserve model performance over time in clinical medicine.
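The baseline experiment, training once and evaluating on later year groups, amounts to computing a discrimination metric per cohort and reporting the drop. A small sketch with a pairwise AUROC (`temporal_shift` is an illustrative helper, not the authors' code):

```python
def auroc(y_true, scores):
    """AUROC as the probability that a random case outranks a random
    control (ties count one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def temporal_shift(model, cohorts):
    """Apply one fixed model to each year-group cohort and report the AUROC
    drop relative to the first (development) cohort.
    `cohorts` maps cohort names to (X, y) pairs, in chronological order."""
    aucs = {name: auroc(y, [model(x) for x in X]) for name, (X, y) in cohorts.items()}
    base = next(iter(aucs.values()))
    return {name: round(base - a, 3) for name, a in aucs.items()}
```

The pairwise formulation is O(n²) but makes the metric's meaning explicit; rank-based implementations are equivalent and faster.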
Affiliation(s)
- Lin Lawrence Guo
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON, Canada
- Stephen R Pfohl
- Biomedical Informatics Research, Stanford University, Palo Alto, USA
- Jason Fries
- Biomedical Informatics Research, Stanford University, Palo Alto, USA
- Alistair E W Johnson
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON, Canada
- Jose Posada
- Biomedical Informatics Research, Stanford University, Palo Alto, USA
- Nigam Shah
- Biomedical Informatics Research, Stanford University, Palo Alto, USA
- Lillian Sung
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON, Canada
- Division of Haematology/Oncology, The Hospital for Sick Children, 555 University Avenue, Toronto, ON, M5G1X8, Canada

45
Williams RD, Markus AF, Yang C, Duarte-Salles T, DuVall SL, Falconer T, Jonnagaddala J, Kim C, Rho Y, Williams AE, Machado AA, An MH, Aragón M, Areia C, Burn E, Choi YH, Drakos I, Abrahão MTF, Fernández-Bertolín S, Hripcsak G, Kaas-Hansen BS, Kandukuri PL, Kors JA, Kostka K, Liaw ST, Lynch KE, Machnicki G, Matheny ME, Morales D, Nyberg F, Park RW, Prats-Uribe A, Pratt N, Rao G, Reich CG, Rivera M, Seinen T, Shoaibi A, Spotnitz ME, Steyerberg EW, Suchard MA, You SC, Zhang L, Zhou L, Ryan PB, Prieto-Alhambra D, Reps JM, Rijnbeek PR. Seek COVER: using a disease proxy to rapidly develop and validate a personalized risk calculator for COVID-19 outcomes in an international network. BMC Med Res Methodol 2022; 22:35. [PMID: 35094685 PMCID: PMC8801189 DOI: 10.1186/s12874-022-01505-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/06/2021] [Accepted: 01/03/2022] [Indexed: 12/23/2022] Open
Abstract
Background We investigated whether influenza data could be used to develop prediction models for COVID-19, to increase the speed at which prediction models can reliably be developed and validated early in a pandemic. We developed COVID-19 Estimated Risk (COVER) scores that quantify a patient's risk of hospital admission with pneumonia (COVER-H), hospitalization with pneumonia requiring intensive services or death (COVER-I), or fatality (COVER-F) in the 30 days following COVID-19 diagnosis, using historical data from patients with influenza or flu-like symptoms, and tested these scores in COVID-19 patients. Methods We analyzed a federated network of electronic medical records and administrative claims data from 14 data sources and 6 countries containing data collected on or before 4/27/2020. We used a 2-step process to develop 3 scores using historical data from patients with influenza or flu-like symptoms any time prior to 2020. The first step was to create a data-driven model using LASSO regularized logistic regression; its covariates were then used to develop aggregate covariates for the second step, in which the COVER scores were developed using a smaller set of features. The 3 COVER scores were externally validated on patients with 1) influenza or flu-like symptoms and 2) confirmed or suspected COVID-19 diagnosis, across 5 databases from South Korea, Spain, and the United States. Outcomes included i) hospitalization with pneumonia, ii) hospitalization with pneumonia requiring intensive services or death, and iii) death in the 30 days after the index date. Results Overall, 44,507 COVID-19 patients were included for model validation. We identified 7 predictors (history of cancer, chronic obstructive pulmonary disease, diabetes, heart disease, hypertension, hyperlipidemia, and kidney disease) which, combined with age and sex, discriminated which patients would experience any of our three outcomes. The models achieved good performance in both the influenza and COVID-19 cohorts. 
For COVID-19, the AUC ranges were COVER-H: 0.69-0.81, COVER-I: 0.73-0.91, and COVER-F: 0.72-0.90. Calibration varied across the validations, with some of the COVID-19 validations less well calibrated than the influenza validations. Conclusions This research demonstrated the utility of using a proxy disease to develop a prediction model. The 3 COVER models, each with 9 predictors, developed using influenza data perform well in COVID-19 patients for predicting hospitalization, intensive services, and fatality. The scores showed good discriminatory performance, which transferred well to the COVID-19 population. There was some miscalibration in the COVID-19 validations, potentially due to the difference in symptom severity between the two diseases; a possible solution is to recalibrate the models in each location before use. Supplementary Information The online version contains supplementary material available at 10.1186/s12874-022-01505-z.
46
Crossnohere NL, Elsaid M, Paskett J, Bose-Brill S, Bridges JFP. Guidelines for artificial intelligence in medicine: A literature review and content analysis of frameworks. J Med Internet Res 2022; 24:e36823. [PMID: 36006692 PMCID: PMC9459836 DOI: 10.2196/36823] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/26/2022] [Revised: 06/02/2022] [Accepted: 07/14/2022] [Indexed: 12/15/2022] Open
Abstract
Background Artificial intelligence (AI) is rapidly expanding in medicine despite a lack of consensus on its application and evaluation. Objective We sought to identify current frameworks guiding the application and evaluation of AI for predictive analytics in medicine and to describe their content. We also assessed which stages along the AI translational spectrum (ie, AI development, reporting, evaluation, implementation, and surveillance) each framework addresses. Methods We performed a literature review of frameworks regarding the oversight of AI in medicine. The search included key topics such as "artificial intelligence," "machine learning," "guidance as topic," and "translational science," and spanned the period 2014-2022. Documents were included if they provided generalizable guidance regarding the use or evaluation of AI in medicine. Included frameworks are summarized descriptively and were subjected to content analysis. A novel evaluation matrix was developed and applied to appraise the frameworks' coverage of content areas across translational stages. Results Fourteen frameworks are featured in the review: six provide descriptive guidance and eight provide reporting checklists for medical applications of AI. Content analysis revealed five considerations related to the oversight of AI in medicine across frameworks: transparency, reproducibility, ethics, effectiveness, and engagement. All frameworks discuss transparency, reproducibility, ethics, and effectiveness, while only half discuss engagement. The evaluation matrix revealed that frameworks were most likely to report AI considerations for the translational stage of development and least likely to report considerations for the translational stage of surveillance. 
Conclusions Existing frameworks for the application and evaluation of AI in medicine notably offer less input on the role of engagement in oversight and on the translational stage of surveillance. Identifying and optimizing strategies for engagement are essential to ensure that AI can meaningfully benefit patients and other end users.
Affiliation(s)
- Norah L Crossnohere
- Department of Biomedical Informatics, The Ohio State University College of Medicine, Columbus, OH, United States
- Division of General Internal Medicine, Department of Internal Medicine, The Ohio State University College of Medicine, Columbus, OH, United States
- Mohamed Elsaid
- Department of Biomedical Informatics, The Ohio State University College of Medicine, Columbus, OH, United States
- Jonathan Paskett
- Department of Biomedical Informatics, The Ohio State University College of Medicine, Columbus, OH, United States
- Seuli Bose-Brill
- Division of General Internal Medicine, Department of Internal Medicine, The Ohio State University College of Medicine, Columbus, OH, United States
- John F P Bridges
- Department of Biomedical Informatics, The Ohio State University College of Medicine, Columbus, OH, United States
47
Yang C, Kors JA, Ioannou S, John LH, Markus AF, Rekkas A, de Ridder MAJ, Seinen TM, Williams RD, Rijnbeek PR. Trends in the conduct and reporting of clinical prediction model development and validation: a systematic review. J Am Med Inform Assoc 2022; 29:983-989. [PMID: 35045179 PMCID: PMC9006694 DOI: 10.1093/jamia/ocac002] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2021] [Revised: 12/01/2021] [Accepted: 01/07/2022] [Indexed: 12/23/2022] Open
Abstract
Objectives This systematic review aims to provide further insights into the conduct and reporting of clinical prediction model development and validation over time. We focus on assessing the reporting of information necessary to enable external validation by other investigators. Materials and Methods We searched Embase, Medline, Web of Science, Cochrane Library, and Google Scholar to identify studies that developed 1 or more multivariable prognostic prediction models using electronic health record (EHR) data published in the period 2009–2019. Results We identified 422 studies that developed a total of 579 clinical prediction models using EHR data. We observed a steep increase over the years in the number of developed models. The percentage of models externally validated in the same paper remained at around 10%. Throughout 2009–2019, for both the target population and the outcome definitions, code lists were provided for less than 20% of the models. For about half of the models that were developed using regression analysis, the final model was not completely presented. Discussion Overall, we observed limited improvement over time in the conduct and reporting of clinical prediction model development and validation. In particular, the prediction problem definition was often not clearly reported, and the final model was often not completely presented. Conclusion Improvement in the reporting of information necessary to enable external validation by other investigators is still urgently needed to increase clinical adoption of developed models.
Affiliation(s)
- Cynthia Yang
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Jan A Kors
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Solomon Ioannou
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Luis H John
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Aniek F Markus
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Alexandros Rekkas
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Maria A J de Ridder
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Tom M Seinen
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Ross D Williams
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Peter R Rijnbeek
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
48
Lee DY, Kim C, Lee S, Son SJ, Cho SM, Cho YH, Lim J, Park RW. Psychosis Relapse Prediction Leveraging Electronic Health Records Data and Natural Language Processing Enrichment Methods. Front Psychiatry 2022; 13:844442. [PMID: 35479497 PMCID: PMC9037331 DOI: 10.3389/fpsyt.2022.844442] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/28/2021] [Accepted: 03/09/2022] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND Identifying patients at a high risk of psychosis relapse is crucial for early interventions. A relevant psychiatric clinical context is often recorded in clinical notes; however, the utilization of unstructured data remains limited. This study aimed to develop psychosis-relapse prediction models using various types of clinical notes and structured data. METHODS Clinical data were extracted from the electronic health records of the Ajou University Medical Center in South Korea. The study population included patients with psychotic disorders, and the outcome was psychosis relapse within 1 year. Using only structured data, we developed an initial prediction model, then three natural language processing (NLP)-enriched models using three types of clinical notes (psychological tests, admission notes, and initial nursing assessment) and one complete model. Latent Dirichlet Allocation was used to cluster the clinical context into similar topics. All models applied the least absolute shrinkage and selection operator logistic regression algorithm. We also performed an external validation using another hospital database. RESULTS A total of 330 patients were included, and 62 (18.8%) experienced psychosis relapse. Six predictors were used in the initial model, and 10 additional topics from Latent Dirichlet Allocation processing were added in the enriched models. The model derived from all notes showed the highest area under the receiver operating characteristic curve (AUROC = 0.946) in the internal validation, followed by models based on the psychological test notes, admission notes, initial nursing assessments, and structured data only (0.902, 0.855, 0.798, and 0.784, respectively). The external validation was performed using only the initial nursing assessment note, and the AUROC was 0.616. CONCLUSIONS We developed prediction models for psychosis relapse using the NLP-enrichment method. Models using clinical notes were more effective than models using only structured data, suggesting the importance of unstructured data in psychosis prediction.
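The enrichment pipeline this abstract describes — per-document topic probabilities derived from clinical notes, concatenated with structured variables and fed to an L1-penalized (LASSO) logistic regression — can be sketched as follows. All notes, feature values, and hyperparameters below are synthetic illustrations, and scikit-learn's plain LDA stands in for the paper's topic-modeling step.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Synthetic clinical notes and structured variables (illustration only).
notes = [
    "auditory hallucinations reported, poor sleep, missed medication",
    "stable mood, adherent to medication, no hallucinations",
    "agitation and medication non-adherence, early relapse signs",
    "routine follow up, good insight, sleeping well, stable",
]
structured = np.array([[34, 2], [51, 0], [28, 3], [60, 1]], dtype=float)  # e.g. age, prior admissions
relapse = np.array([1, 0, 1, 0])

# Step 1: cluster note content into latent topics and obtain, per document,
# the probability of belonging to each topic.
counts = CountVectorizer().fit_transform(notes)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topic_probs = lda.fit_transform(counts)            # shape: (n_docs, n_topics)

# Step 2: append topic probabilities to the structured variables and fit an
# L1-penalized logistic regression, analogous to the cited LASSO models.
X = np.hstack([structured, topic_probs])
model = LogisticRegression(penalty="l1", solver="liblinear", random_state=0)
model.fit(X, relapse)
risk = model.predict_proba(X)[:, 1]                # predicted relapse risk
```

On real data the topic count, vocabulary handling, and regularization strength would all be tuned, and performance would be assessed on held-out (ideally external) patients rather than the training set.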
Affiliation(s)
- Dong Yun Lee
- Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, South Korea
- Chungsoo Kim
- Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, South Korea
- Seongwon Lee
- Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, South Korea
- Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, South Korea
- Sang Joon Son
- Department of Psychiatry, Ajou University School of Medicine, Suwon, South Korea
- Sun-Mi Cho
- Department of Psychiatry, Ajou University School of Medicine, Suwon, South Korea
- Yong Hyuk Cho
- Department of Psychiatry, Ajou University School of Medicine, Suwon, South Korea
- Jaegyun Lim
- Department of Laboratory Medicine, Myongji Hospital, Hanyang University College of Medicine, Goyang, South Korea
- Rae Woong Park
- Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, South Korea
- Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, South Korea
49
Park C, You SC, Jeon H, Jeong CW, Choi JW, Park RW. Development and Validation of the Radiology Common Data Model (R-CDM) for the International Standardization of Medical Imaging Data. Yonsei Med J 2022; 63:S74-S83. [PMID: 35040608 PMCID: PMC8790584 DOI: 10.3349/ymj.2022.63.s74] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/08/2021] [Revised: 10/28/2021] [Accepted: 10/31/2021] [Indexed: 12/02/2022] Open
Abstract
PURPOSE Digital Imaging and Communications in Medicine (DICOM), a standard file format for medical imaging data, contains metadata describing each file. However, metadata are often incomplete, and there is no standardized format for recording metadata, leading to inefficiency during the metadata-based data retrieval process. Here, we propose a novel standardization method for DICOM metadata termed the Radiology Common Data Model (R-CDM). MATERIALS AND METHODS R-CDM was designed to be compatible with Health Level Seven International (HL7)/Fast Healthcare Interoperability Resources (FHIR) and linked with the Observational Medical Outcomes Partnership (OMOP)-CDM to achieve a seamless link between clinical data and medical imaging data. The terminology system was standardized using the RadLex playbook, a comprehensive lexicon of radiology. As a proof of concept, the R-CDM conversion process was conducted with 41.7 TB of data from the Ajou University Hospital. The R-CDM database visualizer was developed to visualize the main characteristics of the R-CDM database. RESULTS Information from 2,801,360 cases and 87,203,226 DICOM files was organized into two tables constituting the R-CDM. Information on imaging device and image resolution was recorded with more than 99.9% accuracy. Furthermore, OMOP-CDM and R-CDM were linked to efficiently extract specific types of images from specific patient cohorts. CONCLUSION R-CDM standardizes the structure and terminology for recording medical imaging data to eliminate incomplete and unstandardized information. Successful standardization was achieved by the extract, transform, and load process and an image classifier. We hope that the R-CDM will contribute to deep learning research in the medical imaging field by enabling the acquisition of large-scale medical imaging data from multinational institutions.
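The extract-transform-load step this abstract describes — coercing heterogeneous DICOM header fields into a standardized record keyed to a controlled vocabulary — might look like the sketch below. The field names, concept IDs, and vocabulary lookup are hypothetical simplifications, not the actual R-CDM schema or RadLex codes; in practice a library such as pydicom would supply the raw header values.

```python
# Raw per-file metadata as it might be read from DICOM headers. Values are
# deliberately messy (mixed types, casing, whitespace) to show normalization.
raw_files = [
    {"Modality": "CT", "Rows": 512, "Columns": 512, "Manufacturer": "Acme Imaging"},
    {"Modality": "mr", "Rows": "256", "Columns": "256", "Manufacturer": " acme imaging "},
]

# Stand-in for a RadLex-style terminology lookup (IDs are made up).
MODALITY_CONCEPTS = {"CT": 1001, "MR": 1002}

def standardize(meta: dict) -> dict:
    """Normalize one file's metadata into a standardized record."""
    return {
        "modality_concept_id": MODALITY_CONCEPTS[meta["Modality"].strip().upper()],
        "rows": int(meta["Rows"]),          # coerce mixed string/int values
        "columns": int(meta["Columns"]),
        "manufacturer": " ".join(meta["Manufacturer"].split()).title(),
    }

records = [standardize(m) for m in raw_files]
```

The point of such a transform is exactly what the abstract measures: after loading, device and resolution fields become queryable with a single, consistent representation regardless of how the source files recorded them.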
Affiliation(s)
- ChulHyoung Park
- Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Korea
- Seng Chan You
- Department of Preventive Medicine, Yonsei University College of Medicine, Seoul, Korea
- Hokyun Jeon
- Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Korea
- Chang Won Jeong
- Medical Convergence Research Center, Wonkwang University, Iksan, Korea
- Jin Wook Choi
- Department of Radiology, Ajou University Medical Center, Suwon, Korea
- Rae Woong Park
- Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Korea
- Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, Korea
50
Seinen TM, Fridgeirsson EA, Ioannou S, Jeannetot D, John LH, Kors JA, Markus AF, Pera V, Rekkas A, Williams RD, Yang C, van Mulligen EM, Rijnbeek PR. Use of unstructured text in prognostic clinical prediction models: a systematic review. J Am Med Inform Assoc 2022; 29:1292-1302. [PMID: 35475536 PMCID: PMC9196702 DOI: 10.1093/jamia/ocac058] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/24/2021] [Revised: 03/06/2022] [Accepted: 04/11/2022] [Indexed: 11/29/2022] Open
Abstract
Objective This systematic review aims to assess how information from unstructured text is used to develop and validate clinical prognostic prediction models. We summarize the prediction problems and the methodological landscape and determine whether using text data in addition to more commonly used structured data improves the prediction performance. Materials and Methods We searched Embase, MEDLINE, Web of Science, and Google Scholar to identify studies that developed prognostic prediction models using information extracted from unstructured text in a data-driven manner, published in the period from January 2005 to March 2021. Data items were extracted and analyzed, and a meta-analysis of the model performance was carried out to assess the added value of text to structured-data models. Results We identified 126 studies that described 145 clinical prediction problems. Combining text and structured data improved model performance, compared with using only text or only structured data. In these studies, a wide variety of dense and sparse numeric text representations were combined with both deep learning and more traditional machine learning methods. External validation, public availability, and attention to the explainability of the developed models were limited. Conclusion The use of unstructured text in the development of prognostic prediction models was found to be beneficial in addition to structured data in most studies. Text data are a valuable source of information for prediction model development and should not be neglected. We suggest a future focus on explainability and external validation of the developed models, promoting robust and trustworthy prediction models in clinical practice.
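One pattern this review covers — combining a sparse numeric text representation with structured predictors in a single model — can be sketched as below. The notes, variables, and outcome labels are synthetic, and TF-IDF is used as a representative sparse representation (the reviewed studies also use dense embeddings and deep learning).

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Synthetic note snippets and structured variables (illustration only).
notes = [
    "shortness of breath, elevated troponin, admitted for observation",
    "routine visit, no acute findings, discharged same day",
    "recurrent chest pain, prior infarction, abnormal ecg",
    "follow up stable, medications unchanged, no complaints",
]
structured = np.array([[72, 1], [45, 0], [80, 1], [50, 0]], dtype=float)  # e.g. age, prior admission
outcome = np.array([1, 0, 1, 0])

# Sparse TF-IDF text features, stacked column-wise with the structured block.
text_features = TfidfVectorizer().fit_transform(notes)
X = hstack([csr_matrix(structured), text_features]).tocsr()

model = LogisticRegression(max_iter=1000).fit(X, outcome)
prob = model.predict_proba(X)[:, 1]
```

Keeping the combined matrix sparse matters at scale: clinical vocabularies easily yield tens of thousands of text columns, and a dense concatenation would be wasteful when most entries are zero.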
Affiliation(s)
- Tom M Seinen
- Corresponding Author: Tom M. Seinen, MSc, Department of Medical Informatics, Erasmus University Medical Center, Molewaterplein 40, 3015 GD Rotterdam, The Netherlands
- Egill A Fridgeirsson
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Solomon Ioannou
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Daniel Jeannetot
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Luis H John
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Jan A Kors
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Aniek F Markus
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Victor Pera
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Alexandros Rekkas
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Ross D Williams
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Cynthia Yang
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Erik M van Mulligen
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Peter R Rijnbeek
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands