Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Gandin I, Scagnetto A, Romani S, Barbati G. Interpretability of time-series deep learning models: A study in cardiovascular patients admitted to Intensive care unit. J Biomed Inform 2021;121:103876. [PMID: 34325021 DOI: 10.1016/j.jbi.2021.103876] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 04/16/2021] [Revised: 07/14/2021] [Accepted: 07/20/2021] [Indexed: 10/20/2022]

Cited by Other Article(s)

Choi H, Kim Y, Kang H, Seo H, Kim M, Han J, Kee G, Park S, Ko S, Jung H, Kim B, Roh JH, Jun TJ, Kim YH. Time series forecasting of weight for diuretic dose adjustment using bidirectional long short-term memory. Sci Rep 2024;14:17723. [PMID: 39085306 PMCID: PMC11292016 DOI: 10.1038/s41598-024-68663-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2024] [Accepted: 07/26/2024] [Indexed: 08/02/2024] Open

Affiliation(s)

Heejung Choi Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
Yunha Kim Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
Heejun Kang Division of Cardiology, Asan Medical Center, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
Hyeram Seo Department of Information Medicine, Asan Medical Center, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
Minkyoung Kim Department of Information Medicine, Asan Medical Center, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
JiYe Han Department of Information Medicine, Asan Medical Center, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
Gaeun Kee Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
Seohyun Park Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
Soyoung Ko Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
HyoJe Jung Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
Byeolhee Kim Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
Jae-Hyung Roh Department of Internal Medicine, Chungnam National University College of Medicine, Chungnam National University Sejong Hospital, 20, Bodeum 7-Ro, Sejong-Si, 30099, Sejong, Republic of Korea
Tae Joon Jun Big Data Research Center, Asan Institute for Life Sciences, Asan Medical Center, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
Young-Hak Kim Division of Cardiology, Department of Information Medicine, Asan Medical Center, University of Ulsan College of Medicine, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea

Diao B, Luo J, Guo Y. A comprehensive survey on deep learning-based identification and predicting the interaction mechanism of long non-coding RNAs. Brief Funct Genomics 2024;23:314-324. [PMID: 38576205 DOI: 10.1093/bfgp/elae010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Revised: 02/25/2024] [Accepted: 03/14/2024] [Indexed: 04/06/2024] Open

Lancia G, Varkila MRJ, Cremer OL, Spitoni C. Two-step interpretable modeling of ICU-AIs. Artif Intell Med 2024;151:102862. [PMID: 38579437 DOI: 10.1016/j.artmed.2024.102862] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Revised: 03/25/2024] [Accepted: 03/25/2024] [Indexed: 04/07/2024]

Xiang L, Gao Z, Wang A, Shim V, Fekete G, Gu Y, Fernandez J. Rethinking running biomechanics: a critical review of ground reaction forces, tibial bone loading, and the role of wearable sensors. Front Bioeng Biotechnol 2024;12:1377383. [PMID: 38650752 PMCID: PMC11033368 DOI: 10.3389/fbioe.2024.1377383] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2024] [Accepted: 03/22/2024] [Indexed: 04/25/2024] Open

Xiang L, Gu Y, Gao Z, Yu P, Shim V, Wang A, Fernandez J. Integrating an LSTM framework for predicting ankle joint biomechanics during gait using inertial sensors. Comput Biol Med 2024;170:108016. [PMID: 38277923 DOI: 10.1016/j.compbiomed.2024.108016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Revised: 01/14/2024] [Accepted: 01/19/2024] [Indexed: 01/28/2024]

Yang X, Huang K, Yang D, Zhao W, Zhou X. Biomedical Big Data Technologies, Applications, and Challenges for Precision Medicine: A Review. GLOBAL CHALLENGES (HOBOKEN, NJ) 2024;8:2300163. [PMID: 38223896 PMCID: PMC10784210 DOI: 10.1002/gch2.202300163] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/02/2023] [Revised: 09/20/2023] [Indexed: 01/16/2024]

Wang Z, Liu J, Tian Y, Zhou T, Liu Q, Qiu Y, Li J. Integrating Medical Domain Knowledge for Early Diagnosis of Fever of Unknown Origin: An Interpretable Hierarchical Multimodal Neural Network Approach. IEEE J Biomed Health Inform 2023;27:5237-5248. [PMID: 37590111 DOI: 10.1109/jbhi.2023.3306041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/19/2023]

Abstract

Accurate and interpretable differential diagnostic technologies are crucial for supporting clinicians in decision-making and treatment-planning for patients with fever of unknown origin (FUO). Existing solutions commonly address the diagnosis of FUO by transforming it into a multi-classification task. However, after the emergence of COVID-19 pandemic, clinicians have recognized the heightened significance of early diagnosis in patients with FUO, particularly for practical needs such as early triage. This has resulted in increased demands for identifying a wider range of etiologies, shorter observation windows, and better model interpretability. In this article, we propose an interpretable hierarchical multimodal neural network framework (iHMNNF) to facilitate early diagnosis of FUO by incorporating medical domain knowledge and leveraging multimodal clinical data. The iHMNNF comprises a top-down hierarchical reasoning framework (Td-HRF) built on the class hierarchy of FUO etiologies, five local attention-based multimodal neural networks (La-MNNs) trained for each parent node of the class hierarchy, and an interpretable module based on layer-wise relevance propagation (LRP) and attention mechanism. Experimental datasets were collected from electronic health records (EHRs) at a large-scale tertiary grade-A hospital in China, comprising 34,051 hospital admissions of 30,794 FUO patients from January 2011 to October 2020. Our proposed La-MNNs achieved area under the receiver operating characteristic curve (AUROC) values ranging from 0.7809 to 0.9035 across all five decomposed tasks, surpassing competing machine learning (ML) and single-modality deep learning (DL) methods while also providing enhanced interpretability. Furthermore, we explored the feasibility of identifying FUO etiologies using only the first N-hour time series data obtained after admission.

Pungitore S, Subbian V. Assessment of Prediction Tasks and Time Window Selection in Temporal Modeling of Electronic Health Record Data: a Systematic Review. JOURNAL OF HEALTHCARE INFORMATICS RESEARCH 2023;7:313-331. [PMID: 37637723 PMCID: PMC10449760 DOI: 10.1007/s41666-023-00143-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2022] [Revised: 04/12/2023] [Accepted: 07/28/2023] [Indexed: 08/29/2023]

Abstract

Temporal electronic health record (EHR) data are often preferred for clinical prediction tasks because they offer more complete representations of a patient's pathophysiology than static data. A challenge when working with temporal EHR data is problem formulation, which includes defining the time windows of interest and the prediction task. Our objective was to conduct a systematic review that assessed the definition and reporting of concepts relevant to temporal clinical prediction tasks. We searched PubMed® and IEEE Xplore® databases for studies from January 1, 2010 applying machine learning models to EHR data for patient outcome prediction. Publications applying time-series methods were selected for further review. We identified 92 studies and summarized them by clinical context and definition and reporting of the prediction problem. For the time windows of interest, 12 studies did not discuss window lengths, 57 used a single set of window lengths, and 23 evaluated the relationship between window length and model performance. We also found that 72 studies had appropriate reporting of the prediction task. However, evaluation of prediction problem formulation for temporal EHR data was complicated by heterogeneity in assessing and reporting of these concepts. Even among studies modeling similar clinical outcomes, there were variations in terminology used to describe the prediction problem, rationale for window lengths, and determination of the outcome of interest. As temporal modeling using EHR data expands, minimal reporting standards should include time-series specific concerns to promote rigor and reproducibility in future studies and facilitate model implementation in clinical settings.

Supplementary Information

The online version contains supplementary material available at 10.1007/s41666-023-00143-4.

Deng Y, Ma Y, Fu J, Wang X, Yu C, Lv J, Man S, Wang B, Li L. A dynamic machine learning model for prediction of NAFLD in a health checkup population: A longitudinal study. Heliyon 2023;9:e18758. [PMID: 37576311 PMCID: PMC10412833 DOI: 10.1016/j.heliyon.2023.e18758] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/13/2023] [Revised: 07/25/2023] [Accepted: 07/26/2023] [Indexed: 08/15/2023] Open

Abstract

Background

Non-alcoholic fatty liver disease (NAFLD) is one of the most common liver diseases worldwide. Currently, most NAFLD prediction models are diagnostic models based on cross-sectional data, which failed to provide early identification or clarify causal relationships. We aimed to use time-series deep learning models with longitudinal health checkup records to predict the onset of NAFLD in the future, and update the model stepwise by incorporating new checkup records to achieve dynamic prediction.

Methods

10,493 participants with over 6 health checkup records from Beijing MJ Health Screening Center were included to conduct a retrospective cohort study, in which the constantly updated initial 5 checkup data were incorporated stepwise to predict the risk of NAFLD at and after their sixth health checkups. A total of 33 variables were considered, consisting of demographic characteristics, medical history, lifestyle, physical examinations, and laboratory tests. L1-penalized logistic regression (LR) was used for feature selection. The long short-term memory (LSTM) algorithm was introduced for model development, and five-fold cross-validation was conducted to tune and choose optimal hyperparameters. Both internal validation and external validation were conducted, using the 20% randomly divided holdout test dataset and previously unseen data from Shanghai MJ Health Screening Center, respectively, to evaluate model performance. The evaluation metrics included area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, Brier score, and decision curve. Bootstrap sampling was implemented to generate 95% confidence intervals of all the metrics. Finally, the Shapley additive explanations (SHAP) algorithm was applied in the holdout test dataset for model interpretability to obtain time-specific and sample-specific contributions of each feature.

Results

Among the 10,493 participants, 1662 (15.84%) were diagnosed with NAFLD at and after their sixth health checkups. The predictive performance of the deep learning model in the internal validation dataset improved over the incorporation of the checkups, with AUROC increasing from 0.729 (95% CI: 0.698,0.760) at baseline to 0.818 (95% CI: 0.798,0.844) when consecutive 5 checkups were included. The external validation dataset, containing 1728 participants, was used to verify the results, in which AUROC increased from 0.700 (95% CI: 0.657,0.740) with only the first checkups to 0.792 (95% CI: 0.758,0.825) with all five. The results of feature significance showed that body fat percentage, alanine transaminase (ALT), and uric acid owned the greatest impact on the outcome, time-specific, individual-specific and dynamic feature contributions were also produced for model interpretability.

Conclusion

A dynamic prediction model was successfully established in our study, and the prediction capability kept improving with the renewal of the latest checkup records. In addition, we identified key features associated with the onset of NAFLD, making it possible to optimize the prevention and control strategies of the disease in the general population.

Affiliation(s)

Yuhan Deng Chongqing Research Institute of Big Data, Peking University, Chongqing, China; Meinian Institute of Health, Beijing, China
Yuan Ma School of Population Medicine and Public Health, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
Jingzhu Fu Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China; Peking University Health Science Center Meinian Public Health Institute, Beijing, China; Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing, China
Xiaona Wang MJ Health Screening Center, Beijing, China
Canqing Yu Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China; Peking University Health Science Center Meinian Public Health Institute, Beijing, China; Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing, China; Peking University Center for Public Health and Epidemic Preparedness & Response, Beijing, China
Jun Lv Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China; Peking University Health Science Center Meinian Public Health Institute, Beijing, China; Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing, China; Peking University Center for Public Health and Epidemic Preparedness & Response, Beijing, China
Sailimai Man Meinian Institute of Health, Beijing, China; Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China; Peking University Health Science Center Meinian Public Health Institute, Beijing, China; Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing, China
Bo Wang Meinian Institute of Health, Beijing, China; Peking University Health Science Center Meinian Public Health Institute, Beijing, China; Peking University Center for Public Health and Epidemic Preparedness & Response, Beijing, China
Liming Li Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China; Peking University Health Science Center Meinian Public Health Institute, Beijing, China; Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing, China; Peking University Center for Public Health and Epidemic Preparedness & Response, Beijing, China

Nayebi A, Tipirneni S, Reddy CK, Foreman B, Subbian V. WindowSHAP: An efficient framework for explaining time-series classifiers based on Shapley values. J Biomed Inform 2023;144:104438. [PMID: 37414368 PMCID: PMC10552726 DOI: 10.1016/j.jbi.2023.104438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 06/29/2023] [Accepted: 07/03/2023] [Indexed: 07/08/2023]

Abstract

Unpacking and comprehending how black-box machine learning algorithms (such as deep learning models) make decisions has been a persistent challenge for researchers and end-users. Explaining time-series predictive models is useful for clinical applications with high stakes to understand the behavior of prediction models, e.g., to determine how different variables and time points influence the clinical outcome. However, existing approaches to explain such models are frequently unique to architectures and data where the features do not have a time-varying component. In this paper, we introduce WindowSHAP, a model-agnostic framework for explaining time-series classifiers using Shapley values. We intend for WindowSHAP to mitigate the computational complexity of calculating Shapley values for long time-series data as well as improve the quality of explanations. WindowSHAP is based on partitioning a sequence into time windows. Under this framework, we present three distinct algorithms of Stationary, Sliding and Dynamic WindowSHAP, each evaluated against baseline approaches, KernelSHAP and TimeSHAP, using perturbation and sequence analyses metrics. We applied our framework to clinical time-series data from both a specialized clinical domain (Traumatic Brain Injury - TBI) as well as a broad clinical domain (critical care medicine). The experimental results demonstrate that, based on the two quantitative metrics, our framework is superior at explaining clinical time-series classifiers, while also reducing the complexity of computations. We show that for time-series data with 120 time steps (hours), merging 10 adjacent time points can reduce the CPU time of WindowSHAP by 80% compared to KernelSHAP. We also show that our Dynamic WindowSHAP algorithm focuses more on the most important time steps and provides more understandable explanations. As a result, WindowSHAP not only accelerates the calculation of Shapley values for time-series data, but also delivers more understandable explanations with higher quality.

Zou M, An Y, Kuang H, Wang J. LGTRL-DE: Local and Global Temporal Representation Learning with Demographic Embedding for in-hospital mortality prediction. J Biomed Inform 2023:104408. [PMID: 37295630 DOI: 10.1016/j.jbi.2023.104408] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2022] [Revised: 03/28/2023] [Accepted: 05/28/2023] [Indexed: 06/12/2023]

Moazemi S, Vahdati S, Li J, Kalkhoff S, Castano LJV, Dewitz B, Bibo R, Sabouniaghdam P, Tootooni MS, Bundschuh RA, Lichtenberg A, Aubin H, Schmid F. Artificial intelligence for clinical decision support for monitoring patients in cardiovascular ICUs: A systematic review. Front Med (Lausanne) 2023;10:1109411. [PMID: 37064042 PMCID: PMC10102653 DOI: 10.3389/fmed.2023.1109411] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/27/2022] [Accepted: 03/10/2023] [Indexed: 04/03/2023] Open

Abstract

Background

Artificial intelligence (AI) and machine learning (ML) models continue to evolve the clinical decision support systems (CDSS). However, challenges arise when it comes to the integration of AI/ML into clinical scenarios. In this systematic review, we followed the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA), the population, intervention, comparator, outcome, and study design (PICOS), and the medical AI life cycle guidelines to investigate studies and tools which address AI/ML-based approaches towards clinical decision support (CDS) for monitoring cardiovascular patients in intensive care units (ICUs). We further discuss recent advances, pitfalls, and future perspectives towards effective integration of AI into routine practices as were identified and elaborated over an extensive selection process for state-of-the-art manuscripts.

Methods

Studies with available English full text from PubMed and Google Scholar in the period from January 2018 to August 2022 were considered. The manuscripts were fetched through a combination of the search keywords including AI, ML, reinforcement learning (RL), deep learning, clinical decision support, and cardiovascular critical care and patients monitoring. The manuscripts were analyzed and filtered based on qualitative and quantitative criteria such as target population, proper study design, cross-validation, and risk of bias.

Results

More than 100 queries over two medical search engines and subjective literature research were developed which identified 89 studies. After extensive assessments of the studies both technically and medically, 21 studies were selected for the final qualitative assessment.

Discussion

Clinical time series and electronic health records (EHR) data were the most common input modalities, while methods such as gradient boosting, recurrent neural networks (RNNs) and RL were mostly used for the analysis. Seventy-five percent of the selected papers lacked validation against external datasets highlighting the generalizability issue. Also, interpretability of the AI decisions was identified as a central issue towards effective integration of AI in healthcare.

Fischer A, Rietveld A, Teunissen P, Bakker P, Hoogendoorn M. End-to-end learning with interpretation on electrohysterography data to predict preterm birth. Comput Biol Med 2023;158:106846. [PMID: 37019011 DOI: 10.1016/j.compbiomed.2023.106846] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2022] [Revised: 03/03/2023] [Accepted: 03/30/2023] [Indexed: 04/03/2023]

Deep-learning-based prognostic modeling for incident heart failure in patients with diabetes using electronic health records: A retrospective cohort study. PLoS One 2023;18:e0281878. [PMID: 36809251 PMCID: PMC9943005 DOI: 10.1371/journal.pone.0281878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2022] [Accepted: 02/02/2023] [Indexed: 02/23/2023] Open

Abstract

Patients with type 2 diabetes mellitus (T2DM) have more than twice the risk of developing heart failure (HF) compared to patients without diabetes. The present study is aimed to build an artificial intelligence (AI) prognostic model that takes in account a large and heterogeneous set of clinical factors and investigates the risk of developing HF in diabetic patients. We carried out an electronic health records- (EHR-) based retrospective cohort study that included patients with cardiological clinical evaluation and no previous diagnosis of HF. Information consists of features extracted from clinical and administrative data obtained as part of routine medical care. The primary endpoint was diagnosis of HF (during out-of-hospital clinical examination or hospitalization). We developed two prognostic models using (1) elastic net regularization for Cox proportional hazard model (COX) and (2) a deep neural network survival method (PHNN), in which a neural network was used to represent a non-linear hazard function and explainability strategies are applied to estimate the influence of predictors on the risk function. Over a median follow-up of 65 months, 17.3% of the 10,614 patients developed HF. The PHNN model outperformed COX both in terms of discrimination (c-index 0.768 vs 0.734) and calibration (2-year integrated calibration index 0.008 vs 0.018). The AI approach led to the identification of 20 predictors of different domains (age, body mass index, echocardiographic and electrocardiographic features, laboratory measurements, comorbidities, therapies) whose relationship with the predicted risk correspond to known trends in the clinical practice. Our results suggest that prognostic models for HF in diabetic patients may improve using EHRs in combination with AI techniques for survival analysis, which provide high flexibility and better performance with respect to standard approaches.

Nguyen HT, Vasconcellos HD, Keck K, Reis JP, Lewis CE, Sidney S, Lloyd-Jones DM, Schreiner PJ, Guallar E, Wu CO, Lima JA, Ambale-Venkatesh B. Multivariate longitudinal data for survival analysis of cardiovascular event prediction in young adults: insights from a comparative explainable study. BMC Med Res Methodol 2023;23:23. [PMID: 36698064 PMCID: PMC9878947 DOI: 10.1186/s12874-023-01845-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/19/2022] [Accepted: 01/18/2023] [Indexed: 01/26/2023] Open

Abstract

BACKGROUND

Multivariate longitudinal data are under-utilized for survival analysis compared to cross-sectional data (CS - data collected once across cohort). Particularly in cardiovascular risk prediction, despite available methods of longitudinal data analysis, the value of longitudinal information has not been established in terms of improved predictive accuracy and clinical applicability.

METHODS

We investigated the value of longitudinal data over and above the use of cross-sectional data via 6 distinct modeling strategies from statistics, machine learning, and deep learning that incorporate repeated measures for survival analysis of the time-to-cardiovascular event in the Coronary Artery Risk Development in Young Adults (CARDIA) cohort. We then examined and compared the use of model-specific interpretability methods (Random Survival Forest Variable Importance) and model-agnostic methods (SHapley Additive exPlanation (SHAP) and Temporal Importance Model Explanation (TIME)) in cardiovascular risk prediction using the top-performing models.

RESULTS

In a cohort of 3539 participants, longitudinal information from 35 variables that were repeatedly collected in 6 exam visits over 15 years improved subsequent long-term (17 years after) risk prediction by up to 8.3% in C-index compared to using baseline data (0.78 vs. 0.72), and up to approximately 4% compared to using the last observed CS data (0.75). Time-varying AUC was also higher in models using longitudinal data (0.86-0.87 at 5 years, 0.79-0.81 at 10 years) than using baseline or last observed CS data (0.80-0.86 at 5 years, 0.73-0.77 at 10 years). Comparative model interpretability analysis revealed the impact of longitudinal variables on model prediction on both the individual and global scales among different modeling strategies, as well as identifying the best time windows and best timing within that window for event prediction. The best strategy to incorporate longitudinal data for accuracy was time series massive feature extraction, and the easiest interpretable strategy was trajectory clustering.

CONCLUSION

Our analysis demonstrates the added value of longitudinal data in predictive accuracy and epidemiological utility in cardiovascular risk survival analysis in young adults via a unified, scalable framework that compares model performance and explainability. The framework can be extended to a larger number of variables and other longitudinal modeling methods.

TRIAL REGISTRATION

ClinicalTrials.gov Identifier: NCT00005130, Registration Date: 26/05/2000.

Early Prediction in Classification of Cardiovascular Diseases with Machine Learning, Neuro-Fuzzy and Statistical Methods. BIOLOGY 2023;12:biology12010117. [PMID: 36671809 PMCID: PMC9855428 DOI: 10.3390/biology12010117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2022] [Revised: 01/06/2023] [Accepted: 01/08/2023] [Indexed: 01/15/2023]

Chen Q, Li R, Lin C, Lai C, Chen D, Qu H, Huang Y, Lu W, Tang Y, Li L. Transferability and interpretability of the sepsis prediction models in the intensive care unit. BMC Med Inform Decis Mak 2022;22:343. [PMID: 36581881 PMCID: PMC9798724 DOI: 10.1186/s12911-022-02090-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/27/2022] [Accepted: 12/16/2022] [Indexed: 12/31/2022] Open

Abstract

BACKGROUND

We aimed to develop an early warning system for real-time sepsis prediction in the ICU by machine learning methods, with tools for interpretative analysis of the predictions. In particular, we focus on the deployment of the system in a target medical center with small historical samples.

METHODS

Light Gradient Boosting Machine (LightGBM) and multilayer perceptron (MLP) were trained on Medical Information Mart for Intensive Care (MIMIC-III) dataset and then finetuned on the private Historical Database of local Ruijin Hospital (HDRJH) using transfer learning technique. The Shapley Additive Explanations (SHAP) analysis was employed to characterize the feature importance in the prediction inference. Ultimately, the performance of the sepsis prediction system was further evaluated in the real-world study in the ICU of the target Ruijin Hospital.

RESULTS

The datasets comprised 6891 patients from MIMIC-III, 453 from HDRJH, and 67 from the Ruijin real-world data. For predictions made 1 to 5 h in advance, the areas under the receiver operating characteristic curve (AUCs) of the LightGBM and MLP models derived from MIMIC-III were 0.98-0.98 and 0.95-0.96, respectively, on the MIMIC-III dataset, compared with 0.82-0.86 and 0.84-0.87, respectively, on HDRJH. After transfer learning and ensemble learning, the AUCs of the final ensemble model improved to 0.94-0.94 on HDRJH and to 0.86-0.90 in the real-world study in the ICU of the target Ruijin Hospital. In addition, the SHAP analysis highlighted the importance of age, antibiotics, net balance, and ventilation for sepsis prediction, making the model interpretable.

CONCLUSIONS

Our machine learning model allows accurate real-time prediction of sepsis up to 5 h in advance. Transfer learning can make it more feasible to deploy the prediction model in a target cohort and can improve model performance in external validation. The SHAP analysis indicates that the roles of antibiotic usage and fluid management need further investigation. We argue that our system and methodology have the potential to improve ICU management by helping medical practitioners identify patients at risk of sepsis and prepare for timely diagnosis and intervention.

TRIAL REGISTRATION

NCT05088850 (retrospectively registered).

Multilayer dynamic ensemble model for intensive care unit mortality prediction of neonate patients. J Biomed Inform 2022;135:104216. [DOI: 10.1016/j.jbi.2022.104216] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/24/2021] [Revised: 09/25/2022] [Accepted: 09/28/2022] [Indexed: 12/26/2022]

Coombes CE, Coombes KR, Fareed N. Sequences of Events from the Electronic Medical Record and the Onset of Infection. Chem Biodivers 2022;19:e202200657. [PMID: 36216587 DOI: 10.1002/cbdv.202200657] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2022] [Accepted: 09/15/2022] [Indexed: 11/06/2022]

Deng Y, Liu S, Wang Z, Wang Y, Jiang Y, Liu B. Explainable time-series deep learning models for the prediction of mortality, prolonged length of stay and 30-day readmission in intensive care patients. Front Med (Lausanne) 2022;9:933037. [PMID: 36250092 PMCID: PMC9554013 DOI: 10.3389/fmed.2022.933037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2022] [Accepted: 09/01/2022] [Indexed: 11/14/2022] Open

Abstract

Background

In-hospital mortality, prolonged length of stay (LOS), and 30-day readmission are common outcomes in the intensive care unit (ICU). Traditional scoring systems and machine learning models for predicting these outcomes usually ignore the time-series nature of ICU data. We aimed to use time-series deep learning models, with a selective combination of variables from three widely used scoring systems, to predict these outcomes.

Materials and methods

A retrospective cohort study was conducted on 40,083 ICU patients from the Medical Information Mart for Intensive Care-IV (MIMIC-IV) database. Three deep learning models, namely, recurrent neural network (RNN), gated recurrent unit (GRU), and long short-term memory (LSTM) with attention mechanisms, were trained for the prediction of in-hospital mortality, prolonged LOS, and 30-day readmission with variables collected during the initial 24 h after ICU admission or the last 24 h before discharge. The inclusion of variables was based on three widely used scoring systems, namely, APACHE II, SOFA, and SAPS II, and the predictors consisted of time-series vital signs, laboratory tests, medications, and procedures. The patients were randomly divided into a training set (80%) and a test set (20%), which were used for model development and model evaluation, respectively. The area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and Brier scores were used to evaluate model performance. Variable significance was identified through attention mechanisms.
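
As a rough illustration of the evaluation metrics named above, the sketch below computes AUC, sensitivity, specificity, and the Brier score in plain Python. The labels and predicted probabilities are made up, and the 0.5 decision threshold is an assumption; the abstract does not state which threshold the authors used.

```python
def auc(y_true, y_score):
    """AUC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(y_true, y_score, threshold=0.5):
    """Sensitivity and specificity after thresholding the predicted probability."""
    pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def brier_score(y_true, y_score):
    """Mean squared difference between predicted probability and outcome."""
    return sum((s - y) ** 2 for y, s in zip(y_true, y_score)) / len(y_true)


# Made-up outcomes (1 = event) and predicted probabilities, for illustration only
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.7, 0.3, 0.1, 0.5]

sens, spec = sensitivity_specificity(y_true, y_prob)
print(auc(y_true, y_prob), sens, spec, brier_score(y_true, y_prob))
```

AUC measures ranking quality independently of any threshold, whereas the Brier score also penalizes poorly calibrated probabilities, which is one reason the two are often reported together.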

Results

A total of 33 variables were included. In total, 40,083 patients were enrolled for mortality and prolonged LOS prediction and 36,180 for readmission prediction. The rates of occurrence of the three outcomes were 9.74%, 27.54%, and 11.79%, respectively. For each of the three outcomes, the performance of RNN, GRU, and LSTM did not differ greatly. Mortality prediction models, prolonged LOS prediction models, and readmission prediction models achieved AUCs of 0.870 ± 0.001, 0.765 ± 0.003, and 0.635 ± 0.018, respectively. The top significant variables co-selected by the three deep learning models were Glasgow Coma Scale (GCS), age, blood urea nitrogen, and norepinephrine for mortality; GCS, invasive ventilation, and blood urea nitrogen for prolonged LOS; and blood urea nitrogen, GCS, and ethnicity for readmission.

Conclusion

The prognostic prediction models established in our study achieved good performance in predicting common outcomes of patients in the ICU, especially mortality. In addition, GCS and blood urea nitrogen were identified as the most important factors strongly associated with adverse ICU events.

邓宇, 姜勇, 王子, 刘爽, 汪雨, 刘宝. [Long short-term memory and Logistic regression for mortality risk prediction of intensive care unit patients with stroke]. BEIJING DA XUE XUE BAO. YI XUE BAN = JOURNAL OF PEKING UNIVERSITY. HEALTH SCIENCES 2022;54:458-467. [PMID: 35701122 PMCID: PMC9197695 DOI: 10.19723/j.issn.1671-167x.2022.03.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2022] [Indexed: 06/15/2023]

Abstract

OBJECTIVE

To select variables related to the mortality risk of stroke patients in the intensive care unit (ICU) using long short-term memory (LSTM) with attention mechanisms and Logistic regression with L1 norm, to construct a mortality risk prediction model based on conventional Logistic regression using the important variables selected by the two models, and to evaluate the model's performance.

METHODS

The Medical Information Mart for Intensive Care (MIMIC)-IV database was retrospectively analyzed, and patients with a primary diagnosis of stroke were selected as the study population. The outcome was defined as in-hospital death after admission. Candidate predictors included demographic information, complications, laboratory tests, and vital signs in the initial 48 h after ICU admission. The data were randomly divided into a training set and a test set ten times at a ratio of 8:2. In the training sets, LSTM with attention mechanisms and Logistic regression with L1 norm were constructed to select important variables. In the test sets, the mean variable importance across the ten repeats was used as a reference to pick the top 10 variables in each of the two models, and these variables were then included in a conventional Logistic regression to build the final prediction model. Model evaluation was based on the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and accuracy. Model performance was also compared with that of a forward Logistic regression model built without prior variable selection.
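
The repeated-split selection described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: `stand_in_importance` is a hypothetical placeholder (absolute difference in group means) used where the study fits LSTM with attention or L1-penalized Logistic regression, the data are synthetic, and for simplicity the score is computed on the training portion of each split.

```python
import random
from statistics import mean


def split_indices(n, train_fraction, rng):
    """Shuffle 0..n-1 and cut the result into train and test index lists."""
    idx = list(range(n))
    rng.shuffle(idx)
    cut = int(round(n * train_fraction))
    return idx[:cut], idx[cut:]


def stand_in_importance(rows, labels, train_idx):
    """Placeholder score: absolute gap in feature means between outcome groups.

    A real pipeline would fit a model here and return its per-variable
    importance (attention weights or L1 coefficients) instead.
    """
    scores = {}
    for name in rows[0]:
        died = [rows[i][name] for i in train_idx if labels[i] == 1]
        survived = [rows[i][name] for i in train_idx if labels[i] == 0]
        scores[name] = abs(mean(died) - mean(survived))
    return scores


def top_variables(rows, labels, k, repeats=10, train_fraction=0.8, seed=0):
    """Average importance over repeated random splits and keep the top k names."""
    rng = random.Random(seed)
    totals = {name: 0.0 for name in rows[0]}
    for _ in range(repeats):
        # The held-out indices would be used for evaluation in a full pipeline
        train_idx, _test_idx = split_indices(len(rows), train_fraction, rng)
        for name, score in stand_in_importance(rows, labels, train_idx).items():
            totals[name] += score / repeats
    ranked = sorted(totals, key=totals.get, reverse=True)
    return ranked[:k], totals


# Synthetic cohort: 10 deaths and 10 survivors with made-up values
rows, labels = [], []
for i in range(10):
    rows.append({"age": 80 + i, "glucose": 8.0 + 0.1 * i, "noise": i % 2})
    labels.append(1)
    rows.append({"age": 50 + i, "glucose": 6.0 + 0.1 * i, "noise": i % 2})
    labels.append(0)

selected, averaged = top_variables(rows, labels, k=2)
print(selected)
```

Averaging over several random splits makes the ranking less sensitive to any single partition of the data, which matters when the cohort is small relative to the number of candidate predictors.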

RESULTS

A total of 2,755 patients with 2,979 ICU admission records were included in the analysis, among which 526 deaths were recorded. The AUC of the Logistic regression model with L1 norm was significantly better than that of LSTM with attention mechanisms (0.819±0.031 vs. 0.760±0.018, P < 0.001). Age, blood glucose, and blood urea nitrogen were among the top ten important variables in both models. The AUC, sensitivity, specificity, and accuracy of the Logistic regression models were 0.85, 85.98%, 71.74%, and 74.26%, respectively. The final prediction model was superior to the forward Logistic regression model.

CONCLUSION

The variables selected by Logistic regression with L1 norm and LSTM with attention mechanisms had good prediction performance, which has important implications for mortality prediction in stroke patients in the ICU.
