1
Choi H, Kim Y, Kang H, Seo H, Kim M, Han J, Kee G, Park S, Ko S, Jung H, Kim B, Roh JH, Jun TJ, Kim YH. Time series forecasting of weight for diuretic dose adjustment using bidirectional long short-term memory. Sci Rep 2024; 14:17723. [PMID: 39085306 PMCID: PMC11292016 DOI: 10.1038/s41598-024-68663-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2024] [Accepted: 07/26/2024] [Indexed: 08/02/2024] Open
Abstract
Loop diuretics are prevailing drugs to manage fluid overload in heart failure. However, adjusting to loop diuretic doses is strenuous due to the lack of a diuretic guideline. Accordingly, we developed a novel clinician decision support system for adjusting loop diuretics dosage with a Long Short-Term Memory (LSTM) algorithm using time-series EMRs. Weight measurements were used as the target to estimate fluid loss during diuretic therapy. We designed the TSFD-LSTM, a bi-directional LSTM model with an attention mechanism, to forecast weight change 48 h after heart failure patients were injected with loop diuretics. The model utilized 65 variables, including disease conditions, concurrent medications, laboratory results, vital signs, and physical measurements from EMRs. The framework processed four sequences simultaneously as inputs. An ablation study on attention mechanisms and a comparison with the transformer model as a baseline were conducted. The TSFD-LSTM outperformed the other models, achieving 85% predictive accuracy with MAE and MSE values of 0.56 and 1.45, respectively. Thus, the TSFD-LSTM model can aid in personalized loop diuretic treatment and prevent adverse drug events, contributing to improved healthcare efficacy for heart failure patients.
Affiliation(s)
- Heejung Choi
  - Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
- Yunha Kim
  - Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
- Heejun Kang
  - Division of Cardiology, Asan Medical Center, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
- Hyeram Seo
  - Department of Information Medicine, Asan Medical Center, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
- Minkyoung Kim
  - Department of Information Medicine, Asan Medical Center, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
- JiYe Han
  - Department of Information Medicine, Asan Medical Center, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
- Gaeun Kee
  - Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
- Seohyun Park
  - Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
- Soyoung Ko
  - Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
- HyoJe Jung
  - Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
- Byeolhee Kim
  - Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
- Jae-Hyung Roh
  - Department of Internal Medicine, Chungnam National University College of Medicine, Chungnam National University Sejong Hospital, 20, Bodeum 7-Ro, Sejong-Si, 30099, Sejong, Republic of Korea
- Tae Joon Jun
  - Big Data Research Center, Asan Institute for Life Sciences, Asan Medical Center, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
- Young-Hak Kim
  - Division of Cardiology, Department of Information Medicine, Asan Medical Center, University of Ulsan College of Medicine, 88, Olympicro 43Gil, Songpagu, 05505, Seoul, Republic of Korea
2
Diao B, Luo J, Guo Y. A comprehensive survey on deep learning-based identification and predicting the interaction mechanism of long non-coding RNAs. Brief Funct Genomics 2024; 23:314-324. [PMID: 38576205 DOI: 10.1093/bfgp/elae010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Revised: 02/25/2024] [Accepted: 03/14/2024] [Indexed: 04/06/2024] Open
Abstract
Long noncoding RNAs (lncRNAs) have been discovered to be extensively involved in eukaryotic epigenetic, transcriptional, and post-transcriptional regulatory processes with the advancements in sequencing technology and genomics research. Therefore, they play crucial roles in the body's normal physiology and various disease outcomes. Presently, numerous unknown lncRNA sequencing data require exploration. Establishing deep learning-based prediction models for lncRNAs provides valuable insights for researchers, substantially reducing time and costs associated with trial and error and facilitating the disease-relevant lncRNA identification for prognosis analysis and targeted drug development as the era of artificial intelligence progresses. However, most lncRNA-related researchers lack awareness of the latest advancements in deep learning models and model selection and application in functional research on lncRNAs. Thus, we elucidate the concept of deep learning models, explore several prevalent deep learning algorithms and their data preferences, conduct a comprehensive review of recent literature studies with exemplary predictive performance over the past 5 years in conjunction with diverse prediction functions, critically analyze and discuss the merits and limitations of current deep learning models and solutions, while also proposing prospects based on cutting-edge advancements in lncRNA research.
Affiliation(s)
- Biyu Diao
  - Department of Breast Surgery, The First Affiliated Hospital of Ningbo University, No. 59, Liuting Street, Haishu District, Ningbo 315000, China
- Jin Luo
  - Department of Breast Surgery, The First Affiliated Hospital of Ningbo University, No. 59, Liuting Street, Haishu District, Ningbo 315000, China
- Yu Guo
  - Department of Breast Surgery, The First Affiliated Hospital of Ningbo University, No. 59, Liuting Street, Haishu District, Ningbo 315000, China
3
Lancia G, Varkila MRJ, Cremer OL, Spitoni C. Two-step interpretable modeling of ICU-AIs. Artif Intell Med 2024; 151:102862. [PMID: 38579437 DOI: 10.1016/j.artmed.2024.102862] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Revised: 03/25/2024] [Accepted: 03/25/2024] [Indexed: 04/07/2024]
Abstract
We present a novel methodology for integrating high resolution longitudinal data with the dynamic prediction capabilities of survival models. The aim is two-fold: to improve the predictive power while maintaining the interpretability of the models. To go beyond the black box paradigm of artificial neural networks, we propose a parsimonious and robust semi-parametric approach (i.e., a landmarking competing risks model) that combines routinely collected low-resolution data with predictive features extracted from a convolutional neural network, that was trained on high resolution time-dependent information. We then use saliency maps to analyze and explain the extra predictive power of this model. To illustrate our methodology, we focus on healthcare-associated infections in patients admitted to an intensive care unit.
Affiliation(s)
- G Lancia
  - Mathematics Department, Utrecht University, Budapestlaan 6, Utrecht, 3584 CD, The Netherlands
- M R J Varkila
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Universiteitsweg 100, Utrecht, 3584 CG, The Netherlands
- O L Cremer
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Universiteitsweg 100, Utrecht, 3584 CG, The Netherlands
- C Spitoni
  - Mathematics Department, Utrecht University, Budapestlaan 6, Utrecht, 3584 CD, The Netherlands
4
Xiang L, Gao Z, Wang A, Shim V, Fekete G, Gu Y, Fernandez J. Rethinking running biomechanics: a critical review of ground reaction forces, tibial bone loading, and the role of wearable sensors. Front Bioeng Biotechnol 2024; 12:1377383. [PMID: 38650752 PMCID: PMC11033368 DOI: 10.3389/fbioe.2024.1377383] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2024] [Accepted: 03/22/2024] [Indexed: 04/25/2024] Open
Abstract
This study presents a comprehensive review of the correlation between tibial acceleration (TA), ground reaction forces (GRF), and tibial bone loading, emphasizing the critical role of wearable sensor technology in accurately measuring these biomechanical forces in the context of running. This systematic review and meta-analysis searched various electronic databases (PubMed, SPORTDiscus, Scopus, IEEE Xplore, and ScienceDirect) to identify relevant studies. It critically evaluates existing research on GRF and tibial acceleration (TA) as indicators of running-related injuries, revealing mixed findings. Intriguingly, recent empirical data indicate only a marginal link between GRF, TA, and tibial bone stress, thus challenging the conventional understanding in this field. The study also highlights the limitations of current biomechanical models and methodologies, proposing a paradigm shift towards more holistic and integrated approaches. The study underscores wearable sensors' potential, enhanced by machine learning, in transforming the monitoring, prevention, and rehabilitation of running-related injuries.
Affiliation(s)
- Liangliang Xiang
  - Department of Radiology, Ningbo No. 2 Hospital, Ningbo, China
  - Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
- Zixiang Gao
  - Department of Radiology, Ningbo No. 2 Hospital, Ningbo, China
  - Faculty of Engineering, University of Pannonia, Veszprém, Hungary
- Alan Wang
  - Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
  - Center for Medical Imaging, Faculty of Medical and Health Sciences, The University of Auckland, Auckland, New Zealand
- Vickie Shim
  - Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
- Gusztáv Fekete
  - Vehicle Industry Research Center, Széchenyi István University, Győr, Hungary
- Yaodong Gu
  - Department of Radiology, Ningbo No. 2 Hospital, Ningbo, China
  - Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
  - Faculty of Sports Science, Ningbo University, Ningbo, China
- Justin Fernandez
  - Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
  - Department of Engineering Science, The University of Auckland, Auckland, New Zealand
5
Xiang L, Gu Y, Gao Z, Yu P, Shim V, Wang A, Fernandez J. Integrating an LSTM framework for predicting ankle joint biomechanics during gait using inertial sensors. Comput Biol Med 2024; 170:108016. [PMID: 38277923 DOI: 10.1016/j.compbiomed.2024.108016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Revised: 01/14/2024] [Accepted: 01/19/2024] [Indexed: 01/28/2024]
Abstract
The ankle joint plays a crucial role in gait, facilitating the articulation of the lower limb, maintaining foot-ground contact, balancing the body, and transmitting the center of gravity. This study aimed to implement long short-term memory (LSTM) networks for predicting ankle joint angles, torques, and contact forces using inertial measurement unit (IMU) sensors. Twenty-five healthy participants were recruited. Two IMU sensors were attached to the foot dorsum and the vertical axis of the distal anteromedial tibia in the right lower limb to record acceleration and angular velocity during running. We proposed a LSTM-MLP (multilayer perceptron) model for training time-series data from IMU sensors and predicting ankle joint biomechanics. The model underwent validation and testing using a custom nested k-fold cross-validation process. The average values of the coefficient of determination (R2), mean absolute error (MAE), and mean squared error (MSE) for ankle dorsiflexion joint and moment, subtalar inversion joint and moment, and ankle joint contact forces were 0.89 ± 0.04, 0.75 ± 1.04, and 2.96 ± 4.96 for walking, and 0.87 ± 0.07, 0.88 ± 1.26, and 4.1 ± 7.17 for running, respectively. This study demonstrates that IMU sensors, combined with LSTM neural networks, are invaluable tools for evaluating ankle joint biomechanics in lower limb pathological diagnosis and rehabilitation, offering a cost-effective and versatile alternative to traditional experimental settings.
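For readers unfamiliar with the three metrics quoted in this abstract (and the MAE and MSE quoted in entry 1), the following is a minimal sketch of their standard definitions. The function name `regression_metrics` and the sample values are hypothetical illustrations; this is not the authors' evaluation code, and the abstract does not say how the metrics were averaged across folds or outputs.

```python
def regression_metrics(y_true, y_pred):
    """Return R^2, MAE and MSE for two equal-length sequences of numbers.

    Hypothetical helper for illustration only; standard textbook definitions.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Mean absolute error: average magnitude of the residuals.
    mae = sum(abs(e) for e in errors) / n
    # Mean squared error: average of the squared residuals.
    mse = sum(e * e for e in errors) / n

    # Coefficient of determination: 1 - (residual sum of squares / total sum of squares).
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"r2": r2, "mae": mae, "mse": mse}


# Made-up values, not data from the study.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
print(metrics)
```

MSE penalizes large errors more heavily than MAE, which is one reason an abstract may report both.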
Affiliation(s)
- Liangliang Xiang
  - Faculty of Sports Science, Ningbo University, Ningbo, China; Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
- Yaodong Gu
  - Faculty of Sports Science, Ningbo University, Ningbo, China; Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
- Zixiang Gao
  - Faculty of Sports Science, Ningbo University, Ningbo, China; Faculty of Engineering, University of Pannonia, Veszprém, Hungary
- Peimin Yu
  - Faculty of Sports Science, Ningbo University, Ningbo, China; Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
- Vickie Shim
  - Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
- Alan Wang
  - Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand; Center for Medical Imaging, Faculty of Medical and Health Sciences, The University of Auckland, Auckland, New Zealand
- Justin Fernandez
  - Faculty of Sports Science, Ningbo University, Ningbo, China; Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand; Department of Engineering Science, The University of Auckland, Auckland, New Zealand
6
Yang X, Huang K, Yang D, Zhao W, Zhou X. Biomedical Big Data Technologies, Applications, and Challenges for Precision Medicine: A Review. Global Challenges (Hoboken, NJ) 2024; 8:2300163. [PMID: 38223896 PMCID: PMC10784210 DOI: 10.1002/gch2.202300163] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/02/2023] [Revised: 09/20/2023] [Indexed: 01/16/2024]
Abstract
The explosive growth of biomedical Big Data presents both significant opportunities and challenges in the realm of knowledge discovery and translational applications within precision medicine. Efficient management, analysis, and interpretation of big data can pave the way for groundbreaking advancements in precision medicine. However, the unprecedented strides in the automated collection of large-scale molecular and clinical data have also introduced formidable challenges in terms of data analysis and interpretation, necessitating the development of novel computational approaches. Some potential challenges include the curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues. This overview article focuses on the recent progress and breakthroughs in the application of big data within precision medicine. Key aspects are summarized, including content, data sources, technologies, tools, challenges, and existing gaps. Nine fields are discussed: data warehouse and data management, electronic medical record, biomedical imaging informatics, artificial intelligence-aided surgical design and surgery optimization, omics data, health monitoring data, knowledge graph, public health informatics, and security and privacy.
Affiliation(s)
- Xue Yang
  - Department of Pancreatic Surgery and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu 610041, China
- Kexin Huang
  - Department of Pancreatic Surgery and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu 610041, China
- Dewei Yang
  - College of Advanced Manufacturing Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400000, China
- Weiling Zhao
  - Center for Systems Medicine, School of Biomedical Informatics, UTHealth at Houston, Houston, TX 77030, USA
- Xiaobo Zhou
  - Center for Systems Medicine, School of Biomedical Informatics, UTHealth at Houston, Houston, TX 77030, USA
7
Wang Z, Liu J, Tian Y, Zhou T, Liu Q, Qiu Y, Li J. Integrating Medical Domain Knowledge for Early Diagnosis of Fever of Unknown Origin: An Interpretable Hierarchical Multimodal Neural Network Approach. IEEE J Biomed Health Inform 2023; 27:5237-5248. [PMID: 37590111 DOI: 10.1109/jbhi.2023.3306041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/19/2023]
Abstract
Accurate and interpretable differential diagnostic technologies are crucial for supporting clinicians in decision-making and treatment-planning for patients with fever of unknown origin (FUO). Existing solutions commonly address the diagnosis of FUO by transforming it into a multi-classification task. However, after the emergence of COVID-19 pandemic, clinicians have recognized the heightened significance of early diagnosis in patients with FUO, particularly for practical needs such as early triage. This has resulted in increased demands for identifying a wider range of etiologies, shorter observation windows, and better model interpretability. In this article, we propose an interpretable hierarchical multimodal neural network framework (iHMNNF) to facilitate early diagnosis of FUO by incorporating medical domain knowledge and leveraging multimodal clinical data. The iHMNNF comprises a top-down hierarchical reasoning framework (Td-HRF) built on the class hierarchy of FUO etiologies, five local attention-based multimodal neural networks (La-MNNs) trained for each parent node of the class hierarchy, and an interpretable module based on layer-wise relevance propagation (LRP) and attention mechanism. Experimental datasets were collected from electronic health records (EHRs) at a large-scale tertiary grade-A hospital in China, comprising 34,051 hospital admissions of 30,794 FUO patients from January 2011 to October 2020. Our proposed La-MNNs achieved area under the receiver operating characteristic curve (AUROC) values ranging from 0.7809 to 0.9035 across all five decomposed tasks, surpassing competing machine learning (ML) and single-modality deep learning (DL) methods while also providing enhanced interpretability. Furthermore, we explored the feasibility of identifying FUO etiologies using only the first N-hour time series data obtained after admission.
8
Pungitore S, Subbian V. Assessment of Prediction Tasks and Time Window Selection in Temporal Modeling of Electronic Health Record Data: a Systematic Review. Journal of Healthcare Informatics Research 2023; 7:313-331. [PMID: 37637723 PMCID: PMC10449760 DOI: 10.1007/s41666-023-00143-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2022] [Revised: 04/12/2023] [Accepted: 07/28/2023] [Indexed: 08/29/2023]
Abstract
Temporal electronic health record (EHR) data are often preferred for clinical prediction tasks because they offer more complete representations of a patient's pathophysiology than static data. A challenge when working with temporal EHR data is problem formulation, which includes defining the time windows of interest and the prediction task. Our objective was to conduct a systematic review that assessed the definition and reporting of concepts relevant to temporal clinical prediction tasks. We searched PubMed® and IEEE Xplore® databases for studies from January 1, 2010 applying machine learning models to EHR data for patient outcome prediction. Publications applying time-series methods were selected for further review. We identified 92 studies and summarized them by clinical context and definition and reporting of the prediction problem. For the time windows of interest, 12 studies did not discuss window lengths, 57 used a single set of window lengths, and 23 evaluated the relationship between window length and model performance. We also found that 72 studies had appropriate reporting of the prediction task. However, evaluation of prediction problem formulation for temporal EHR data was complicated by heterogeneity in assessing and reporting of these concepts. Even among studies modeling similar clinical outcomes, there were variations in terminology used to describe the prediction problem, rationale for window lengths, and determination of the outcome of interest. As temporal modeling using EHR data expands, minimal reporting standards should include time-series specific concerns to promote rigor and reproducibility in future studies and facilitate model implementation in clinical settings. Supplementary Information The online version contains supplementary material available at 10.1007/s41666-023-00143-4.
Affiliation(s)
- Sarah Pungitore
  - Program in Applied Mathematics, Department of Mathematics, 617 N Santa Rita Ave, Tucson, AZ 85721 USA
- Vignesh Subbian
  - Department of Biomedical Engineering, The University of Arizona, Tucson, AZ 85721-0020 USA
  - Department of Systems and Industrial Engineering, The University of Arizona, Tucson, AZ 85721-0020 USA
9
Deng Y, Ma Y, Fu J, Wang X, Yu C, Lv J, Man S, Wang B, Li L. A dynamic machine learning model for prediction of NAFLD in a health checkup population: A longitudinal study. Heliyon 2023; 9:e18758. [PMID: 37576311 PMCID: PMC10412833 DOI: 10.1016/j.heliyon.2023.e18758] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/13/2023] [Revised: 07/25/2023] [Accepted: 07/26/2023] [Indexed: 08/15/2023] Open
Abstract
Background: Non-alcoholic fatty liver disease (NAFLD) is one of the most common liver diseases worldwide. Currently, most NAFLD prediction models are diagnostic models based on cross-sectional data, which failed to provide early identification or clarify causal relationships. We aimed to use time-series deep learning models with longitudinal health checkup records to predict the onset of NAFLD in the future, and update the model stepwise by incorporating new checkup records to achieve dynamic prediction.
Methods: 10,493 participants with over 6 health checkup records from Beijing MJ Health Screening Center were included to conduct a retrospective cohort study, in which the constantly updated initial 5 checkup data were incorporated stepwise to predict the risk of NAFLD at and after their sixth health checkups. A total of 33 variables were considered, consisting of demographic characteristics, medical history, lifestyle, physical examinations, and laboratory tests. L1-penalized logistic regression (LR) was used for feature selection. The long short-term memory (LSTM) algorithm was introduced for model development, and five-fold cross-validation was conducted to tune and choose optimal hyperparameters. Both internal validation and external validation were conducted, using the 20% randomly divided holdout test dataset and previously unseen data from Shanghai MJ Health Screening Center, respectively, to evaluate model performance. The evaluation metrics included area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, Brier score, and decision curve. Bootstrap sampling was implemented to generate 95% confidence intervals of all the metrics. Finally, the Shapley additive explanations (SHAP) algorithm was applied in the holdout test dataset for model interpretability to obtain time-specific and sample-specific contributions of each feature.
Results: Among the 10,493 participants, 1662 (15.84%) were diagnosed with NAFLD at and after their sixth health checkups. The predictive performance of the deep learning model in the internal validation dataset improved over the incorporation of the checkups, with AUROC increasing from 0.729 (95% CI: 0.698,0.760) at baseline to 0.818 (95% CI: 0.798,0.844) when consecutive 5 checkups were included. The external validation dataset, containing 1728 participants, was used to verify the results, in which AUROC increased from 0.700 (95% CI: 0.657,0.740) with only the first checkups to 0.792 (95% CI: 0.758,0.825) with all five. The results of feature significance showed that body fat percentage, alanine transaminase (ALT), and uric acid owned the greatest impact on the outcome, time-specific, individual-specific and dynamic feature contributions were also produced for model interpretability.
Conclusion: A dynamic prediction model was successfully established in our study, and the prediction capability kept improving with the renewal of the latest checkup records. In addition, we identified key features associated with the onset of NAFLD, making it possible to optimize the prevention and control strategies of the disease in the general population.
Affiliation(s)
- Yuhan Deng
  - Chongqing Research Institute of Big Data, Peking University, Chongqing, China
  - Meinian Institute of Health, Beijing, China
- Yuan Ma
  - School of Population Medicine and Public Health, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Jingzhu Fu
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
  - Peking University Health Science Center Meinian Public Health Institute, Beijing, China
  - Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing, China
- Canqing Yu
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
  - Peking University Health Science Center Meinian Public Health Institute, Beijing, China
  - Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing, China
  - Peking University Center for Public Health and Epidemic Preparedness & Response, Beijing, China
- Jun Lv
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
  - Peking University Health Science Center Meinian Public Health Institute, Beijing, China
  - Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing, China
  - Peking University Center for Public Health and Epidemic Preparedness & Response, Beijing, China
- Sailimai Man
  - Meinian Institute of Health, Beijing, China
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
  - Peking University Health Science Center Meinian Public Health Institute, Beijing, China
  - Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing, China
- Bo Wang
  - Meinian Institute of Health, Beijing, China
  - Peking University Health Science Center Meinian Public Health Institute, Beijing, China
  - Peking University Center for Public Health and Epidemic Preparedness & Response, Beijing, China
- Liming Li
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
  - Peking University Health Science Center Meinian Public Health Institute, Beijing, China
  - Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing, China
  - Peking University Center for Public Health and Epidemic Preparedness & Response, Beijing, China
10
Nayebi A, Tipirneni S, Reddy CK, Foreman B, Subbian V. WindowSHAP: An efficient framework for explaining time-series classifiers based on Shapley values. J Biomed Inform 2023; 144:104438. [PMID: 37414368 PMCID: PMC10552726 DOI: 10.1016/j.jbi.2023.104438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 06/29/2023] [Accepted: 07/03/2023] [Indexed: 07/08/2023]
Abstract
Unpacking and comprehending how black-box machine learning algorithms (such as deep learning models) make decisions has been a persistent challenge for researchers and end-users. Explaining time-series predictive models is useful for clinical applications with high stakes to understand the behavior of prediction models, e.g., to determine how different variables and time points influence the clinical outcome. However, existing approaches to explain such models are frequently unique to architectures and data where the features do not have a time-varying component. In this paper, we introduce WindowSHAP, a model-agnostic framework for explaining time-series classifiers using Shapley values. We intend for WindowSHAP to mitigate the computational complexity of calculating Shapley values for long time-series data as well as improve the quality of explanations. WindowSHAP is based on partitioning a sequence into time windows. Under this framework, we present three distinct algorithms of Stationary, Sliding and Dynamic WindowSHAP, each evaluated against baseline approaches, KernelSHAP and TimeSHAP, using perturbation and sequence analyses metrics. We applied our framework to clinical time-series data from both a specialized clinical domain (Traumatic Brain Injury - TBI) as well as a broad clinical domain (critical care medicine). The experimental results demonstrate that, based on the two quantitative metrics, our framework is superior at explaining clinical time-series classifiers, while also reducing the complexity of computations. We show that for time-series data with 120 time steps (hours), merging 10 adjacent time points can reduce the CPU time of WindowSHAP by 80 % compared to KernelSHAP. We also show that our Dynamic WindowSHAP algorithm focuses more on the most important time steps and provides more understandable explanations. 
As a result, WindowSHAP not only accelerates the calculation of Shapley values for time-series data, but also delivers more understandable explanations with higher quality.
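The abstract says WindowSHAP is based on partitioning a sequence into time windows, and gives the example of merging 10 adjacent time points in a 120-step series. The sketch below only illustrates that fixed-length partitioning step and how it shrinks the number of units to attribute. The function names are hypothetical, and this is not the authors' implementation or the API of any SHAP library.

```python
def fixed_length_windows(num_steps, window_len):
    """Split time steps 0..num_steps-1 into consecutive half-open (start, end) windows.

    Hypothetical helper for illustration; the final window may be shorter
    when num_steps is not a multiple of window_len.
    """
    if num_steps <= 0 or window_len <= 0:
        raise ValueError("num_steps and window_len must be positive")
    return [
        (start, min(start + window_len, num_steps))
        for start in range(0, num_steps, window_len)
    ]


def attribution_units(num_variables, num_steps, window_len):
    """Count (variable, window) pairs that would each receive one attribution value."""
    return num_variables * len(fixed_length_windows(num_steps, window_len))


windows = fixed_length_windows(120, 10)
print(len(windows), windows[0], windows[-1])

# Assumed example: 5 variables over 120 hourly steps.
print(attribution_units(5, 120, 1))   # one unit per variable per time step
print(attribution_units(5, 120, 10))  # one unit per variable per 10-step window
```

Because the cost of Shapley value estimation grows with the number of units being attributed, treating each window rather than each time step as a unit is the intuition behind the reduced computation the abstract reports.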
Affiliation(s)
- Amin Nayebi
  - Department of Systems and Industrial Engineering, University of Arizona, AZ, USA
- Vignesh Subbian
  - Department of Systems and Industrial Engineering, University of Arizona, AZ, USA; Department of Biomedical Engineering, University of Arizona, AZ, USA
| |
Collapse
|
11
|
Zou M, An Y, Kuang H, Wang J. LGTRL-DE: Local and Global Temporal Representation Learning with Demographic Embedding for in-hospital mortality prediction. J Biomed Inform 2023:104408. [PMID: 37295630 DOI: 10.1016/j.jbi.2023.104408] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2022] [Revised: 03/28/2023] [Accepted: 05/28/2023] [Indexed: 06/12/2023]
Abstract
Predicting the patient's in-hospital mortality from the historical Electronic Medical Records (EMRs) can assist physicians to make clinical decisions and assign medical resources. In recent years, researchers proposed many deep learning methods to predict in-hospital mortality by learning patient representations. However, most of these methods fail to comprehensively learn the temporal representations and do not sufficiently mine the contextual knowledge of demographic information. We propose a novel end-to-end approach based on Local and Global Temporal Representation Learning with Demographic Embedding (LGTRL-DE) to address the current issues for in-hospital mortality prediction. LGTRL-DE is enabled by (1) a local temporal representation learning module that captures the temporal information and analyzes the health status from a local perspective through a recurrent neural network with the demographic initialization and the local attention mechanism; (2) a Transformer-based global temporal representation learning module that extracts the interaction dependencies among clinical events; (3) a multi-view representation fusion module that fuses temporal and static information and generates the final patient's health representations. We evaluate our proposed LGTRL-DE on two public real-world clinical datasets (MIMIC-III and e-ICU). Experimental results show that LGTRL-DE achieves an area under receiver operating characteristic curve of 0.8685 and 0.8733 on the MIMIC-III and e-ICU datasets, respectively, outperforming state-of-the-art approaches.
Affiliation(s)
- Mengjie Zou
- School of Computer Science and Engineering, Central South University, Changsha, 410083, PR China.
- Ying An
- The Institute of Big Data, Central South University, Changsha, 410083, PR China.
- Hulin Kuang
- School of Computer Science and Engineering, Central South University, Changsha, 410083, PR China.
- Jianxin Wang
- School of Computer Science and Engineering, Central South University, Changsha, 410083, PR China.
|
12
|
Moazemi S, Vahdati S, Li J, Kalkhoff S, Castano LJV, Dewitz B, Bibo R, Sabouniaghdam P, Tootooni MS, Bundschuh RA, Lichtenberg A, Aubin H, Schmid F. Artificial intelligence for clinical decision support for monitoring patients in cardiovascular ICUs: A systematic review. Front Med (Lausanne) 2023; 10:1109411. [PMID: 37064042 PMCID: PMC10102653 DOI: 10.3389/fmed.2023.1109411] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/27/2022] [Accepted: 03/10/2023] [Indexed: 04/03/2023] Open
Abstract
Background: Artificial intelligence (AI) and machine learning (ML) models continue to evolve the clinical decision support systems (CDSS). However, challenges arise when it comes to the integration of AI/ML into clinical scenarios. In this systematic review, we followed the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA), the population, intervention, comparator, outcome, and study design (PICOS), and the medical AI life cycle guidelines to investigate studies and tools which address AI/ML-based approaches towards clinical decision support (CDS) for monitoring cardiovascular patients in intensive care units (ICUs). We further discuss recent advances, pitfalls, and future perspectives towards effective integration of AI into routine practices as were identified and elaborated over an extensive selection process for state-of-the-art manuscripts. Methods: Studies with available English full text from PubMed and Google Scholar in the period from January 2018 to August 2022 were considered. The manuscripts were fetched through a combination of the search keywords including AI, ML, reinforcement learning (RL), deep learning, clinical decision support, and cardiovascular critical care and patients monitoring. The manuscripts were analyzed and filtered based on qualitative and quantitative criteria such as target population, proper study design, cross-validation, and risk of bias. Results: More than 100 queries over two medical search engines and subjective literature research were developed which identified 89 studies. After extensive assessments of the studies both technically and medically, 21 studies were selected for the final qualitative assessment. Discussion: Clinical time series and electronic health records (EHR) data were the most common input modalities, while methods such as gradient boosting, recurrent neural networks (RNNs) and RL were mostly used for the analysis. Seventy-five percent of the selected papers lacked validation against external datasets, highlighting the generalizability issue. Also, interpretability of the AI decisions was identified as a central issue towards effective integration of AI in healthcare.
Affiliation(s)
- Sobhan Moazemi
- Digital Health Lab Düsseldorf, Department of Cardiovascular Surgery, Medical Faculty and University Hospital Düsseldorf, Düsseldorf, Germany
- *Correspondence: Sobhan Moazemi,
- Sahar Vahdati
- Institute for Applied Informatics (InfAI), Dresden, Germany
- Jason Li
- Institute for Applied Informatics (InfAI), Dresden, Germany
- Sebastian Kalkhoff
- Digital Health Lab Düsseldorf, Department of Cardiovascular Surgery, Medical Faculty and University Hospital Düsseldorf, Düsseldorf, Germany
- Luis J. V. Castano
- Digital Health Lab Düsseldorf, Department of Cardiovascular Surgery, Medical Faculty and University Hospital Düsseldorf, Düsseldorf, Germany
- Bastian Dewitz
- Digital Health Lab Düsseldorf, Department of Cardiovascular Surgery, Medical Faculty and University Hospital Düsseldorf, Düsseldorf, Germany
- Roman Bibo
- Digital Health Lab Düsseldorf, Department of Cardiovascular Surgery, Medical Faculty and University Hospital Düsseldorf, Düsseldorf, Germany
- Mohammad S. Tootooni
- Department of Health Informatics and Data Science, Loyola University Chicago, Chicago, IL, United States
- Ralph A. Bundschuh
- Nuclear Medicine, Medical Faculty, University Augsburg, Augsburg, Germany
- Artur Lichtenberg
- Digital Health Lab Düsseldorf, Department of Cardiovascular Surgery, Medical Faculty and University Hospital Düsseldorf, Düsseldorf, Germany
- Hug Aubin
- Digital Health Lab Düsseldorf, Department of Cardiovascular Surgery, Medical Faculty and University Hospital Düsseldorf, Düsseldorf, Germany
- Falko Schmid
- Digital Health Lab Düsseldorf, Department of Cardiovascular Surgery, Medical Faculty and University Hospital Düsseldorf, Düsseldorf, Germany
|
13
|
Fischer A, Rietveld A, Teunissen P, Bakker P, Hoogendoorn M. End-to-end learning with interpretation on electrohysterography data to predict preterm birth. Comput Biol Med 2023; 158:106846. [PMID: 37019011 DOI: 10.1016/j.compbiomed.2023.106846] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2022] [Revised: 03/03/2023] [Accepted: 03/30/2023] [Indexed: 04/03/2023]
Abstract
Prediction of preterm birth is a difficult task for clinicians. By examining an electrohysterogram, electrical activity of the uterus that can lead to preterm birth can be detected. Since signals associated with uterine activity are difficult to interpret for clinicians without a background in signal processing, machine learning may be a viable solution. We are the first to employ deep learning models, a long short-term memory model and a temporal convolutional network model, on electrohysterography data using the Term-Preterm Electrohysterogram database. We show that end-to-end learning achieves an AUC score of 0.58, which is comparable to machine learning models that use handcrafted features. Moreover, we evaluate the effect of adding clinical data to the model and conclude that adding the available clinical data to electrohysterography data does not result in a gain in performance. Also, we propose an interpretability framework for time series classification that is well suited to use with limited data, as opposed to existing methods that require large amounts of data. Clinicians with extensive work experience as gynaecologists used our framework to provide insights on how to link our results to clinical practice and stress that, in order to decrease the number of false positives, a dataset with patients at high risk of preterm birth should be collected. All code is made publicly available.
|
14
|
Deep-learning-based prognostic modeling for incident heart failure in patients with diabetes using electronic health records: A retrospective cohort study. PLoS One 2023; 18:e0281878. [PMID: 36809251 PMCID: PMC9943005 DOI: 10.1371/journal.pone.0281878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2022] [Accepted: 02/02/2023] [Indexed: 02/23/2023] Open
Abstract
Patients with type 2 diabetes mellitus (T2DM) have more than twice the risk of developing heart failure (HF) compared to patients without diabetes. The present study aimed to build an artificial intelligence (AI) prognostic model that takes into account a large and heterogeneous set of clinical factors and investigates the risk of developing HF in diabetic patients. We carried out an electronic health records- (EHR-) based retrospective cohort study that included patients with cardiological clinical evaluation and no previous diagnosis of HF. Information consists of features extracted from clinical and administrative data obtained as part of routine medical care. The primary endpoint was diagnosis of HF (during out-of-hospital clinical examination or hospitalization). We developed two prognostic models using (1) elastic net regularization for Cox proportional hazard model (COX) and (2) a deep neural network survival method (PHNN), in which a neural network was used to represent a non-linear hazard function and explainability strategies are applied to estimate the influence of predictors on the risk function. Over a median follow-up of 65 months, 17.3% of the 10,614 patients developed HF. The PHNN model outperformed COX both in terms of discrimination (c-index 0.768 vs 0.734) and calibration (2-year integrated calibration index 0.008 vs 0.018). The AI approach led to the identification of 20 predictors of different domains (age, body mass index, echocardiographic and electrocardiographic features, laboratory measurements, comorbidities, therapies) whose relationship with the predicted risk corresponds to known trends in clinical practice. Our results suggest that prognostic models for HF in diabetic patients may improve using EHRs in combination with AI techniques for survival analysis, which provide high flexibility and better performance with respect to standard approaches.
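The discrimination metric quoted in this abstract (c-index) can be sketched as follows. This is a minimal illustration of Harrell's concordance index for right-censored data, not the study's evaluation code; the function name and the toy inputs are hypothetical, and tied risk scores are assumed to count as half-concordant.

```python
def concordance_index(times, events, risks):
    """Harrell's C: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when subject i had the event and
    did so before subject j's observed time. The pair is concordant
    when the model gave i the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy usage: follow-up in months, event flag (1 = HF diagnosed), risk score.
c = concordance_index([5, 3, 8], [1, 1, 0], [0.2, 0.8, 0.1])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported 0.768 vs 0.734 in context.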
|
15
|
Nguyen HT, Vasconcellos HD, Keck K, Reis JP, Lewis CE, Sidney S, Lloyd-Jones DM, Schreiner PJ, Guallar E, Wu CO, Lima JA, Ambale-Venkatesh B. Multivariate longitudinal data for survival analysis of cardiovascular event prediction in young adults: insights from a comparative explainable study. BMC Med Res Methodol 2023; 23:23. [PMID: 36698064 PMCID: PMC9878947 DOI: 10.1186/s12874-023-01845-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/19/2022] [Accepted: 01/18/2023] [Indexed: 01/26/2023] Open
Abstract
BACKGROUND Multivariate longitudinal data are under-utilized for survival analysis compared to cross-sectional data (CS - data collected once across cohort). Particularly in cardiovascular risk prediction, despite available methods of longitudinal data analysis, the value of longitudinal information has not been established in terms of improved predictive accuracy and clinical applicability. METHODS We investigated the value of longitudinal data over and above the use of cross-sectional data via 6 distinct modeling strategies from statistics, machine learning, and deep learning that incorporate repeated measures for survival analysis of the time-to-cardiovascular event in the Coronary Artery Risk Development in Young Adults (CARDIA) cohort. We then examined and compared the use of model-specific interpretability methods (Random Survival Forest Variable Importance) and model-agnostic methods (SHapley Additive exPlanation (SHAP) and Temporal Importance Model Explanation (TIME)) in cardiovascular risk prediction using the top-performing models. RESULTS In a cohort of 3539 participants, longitudinal information from 35 variables that were repeatedly collected in 6 exam visits over 15 years improved subsequent long-term (17 years after) risk prediction by up to 8.3% in C-index compared to using baseline data (0.78 vs. 0.72), and up to approximately 4% compared to using the last observed CS data (0.75). Time-varying AUC was also higher in models using longitudinal data (0.86-0.87 at 5 years, 0.79-0.81 at 10 years) than using baseline or last observed CS data (0.80-0.86 at 5 years, 0.73-0.77 at 10 years). Comparative model interpretability analysis revealed the impact of longitudinal variables on model prediction on both the individual and global scales among different modeling strategies, as well as identifying the best time windows and best timing within that window for event prediction. 
The best strategy to incorporate longitudinal data for accuracy was time series massive feature extraction, and the easiest interpretable strategy was trajectory clustering. CONCLUSION Our analysis demonstrates the added value of longitudinal data in predictive accuracy and epidemiological utility in cardiovascular risk survival analysis in young adults via a unified, scalable framework that compares model performance and explainability. The framework can be extended to a larger number of variables and other longitudinal modeling methods. TRIAL REGISTRATION ClinicalTrials.gov Identifier: NCT00005130, Registration Date: 26/05/2000.
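The percentage gains quoted in this abstract appear to be relative improvements in C-index rather than absolute differences. That reading is an assumption, but it is consistent with the reported figures, as this small check shows (the helper name is hypothetical):

```python
def relative_gain_percent(new, reference):
    """Relative improvement of `new` over `reference`, in percent."""
    return (new - reference) / reference * 100


# Longitudinal model (0.78) versus baseline-only data (0.72)
gain_vs_baseline = relative_gain_percent(0.78, 0.72)
# Longitudinal model (0.78) versus last observed cross-sectional data (0.75)
gain_vs_last_cs = relative_gain_percent(0.78, 0.75)
```

Under this reading, the absolute gains are 0.06 and 0.03 C-index points, respectively.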
Affiliation(s)
- Hieu T. Nguyen
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD USA
- Henrique D. Vasconcellos
- Department of Cardiology, Johns Hopkins University, Baltimore, MD USA
- Kimberley Keck
- Department of Cardiology, Johns Hopkins University, Baltimore, MD USA
- Jared P. Reis
- National Heart, Lung, and Blood Institute, Bethesda, MD USA
- Cora E. Lewis
- Department of Epidemiology, School of Public Health, University of Alabama at Birmingham, Birmingham, AL USA
- Steven Sidney
- Division of Research, Kaiser Permanente, Oakland, CA USA
- Donald M. Lloyd-Jones
- Department of Preventive Medicine, Northwestern University, Chicago, IL USA
- Pamela J. Schreiner
- School of Public Health, University of Minnesota, Minneapolis, MN USA
- Eliseo Guallar
- Department of Epidemiology, Johns Hopkins University School of Public Health, Baltimore, MD USA
- Colin O. Wu
- National Heart, Lung, and Blood Institute, Bethesda, MD USA
- João A.C. Lima
- Department of Cardiology, Johns Hopkins University, Baltimore, MD USA; Department of Radiology, Johns Hopkins University, Baltimore, MD USA
- Bharath Ambale-Venkatesh
- Department of Radiology, Johns Hopkins University, Baltimore, MD USA
|
16
|
Early Prediction in Classification of Cardiovascular Diseases with Machine Learning, Neuro-Fuzzy and Statistical Methods. BIOLOGY 2023; 12:biology12010117. [PMID: 36671809 PMCID: PMC9855428 DOI: 10.3390/biology12010117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2022] [Revised: 01/06/2023] [Accepted: 01/08/2023] [Indexed: 01/15/2023]
Abstract
Timely and accurate detection of cardiovascular diseases (CVDs) is critically important to minimize the risk of a myocardial infarction. Relations between factors of CVDs are complex, ill-defined and nonlinear, justifying the use of artificial intelligence tools. These tools aid in predicting and classifying CVDs. In this article, we propose a methodology using machine learning (ML) approaches to predict, classify and improve the diagnostic accuracy of CVDs, including support vector regression (SVR), multivariate adaptive regression splines, the M5Tree model and neural networks for the training process. Moreover, adaptive neuro-fuzzy and statistical approaches, nearest neighbor/naive Bayes classifiers and adaptive neuro-fuzzy inference system (ANFIS) are used to predict seventeen CVD risk factors. Mixed-data transformation and classification methods are employed for categorical and continuous variables predicting CVD risk. We compare our hybrid models and existing ML techniques on a CVD real dataset collected from a hospital. A sensitivity analysis is performed to determine the influence and exhibit the essential variables with regard to CVDs, such as the patient's age, cholesterol level and glucose level. Our results report that the proposed methodology outperformed well known statistical and ML approaches, showing their versatility and utility in CVD classification. Our investigation indicates that the prediction accuracy of ANFIS for the training process is 96.56%, followed by SVR with 91.95% prediction accuracy. Our study includes a comprehensive comparison of results obtained for the mentioned methods.
|
17
|
Chen Q, Li R, Lin C, Lai C, Chen D, Qu H, Huang Y, Lu W, Tang Y, Li L. Transferability and interpretability of the sepsis prediction models in the intensive care unit. BMC Med Inform Decis Mak 2022; 22:343. [PMID: 36581881 PMCID: PMC9798724 DOI: 10.1186/s12911-022-02090-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/27/2022] [Accepted: 12/16/2022] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND We aimed to develop an early warning system for real-time sepsis prediction in the ICU by machine learning methods, with tools for interpretative analysis of the predictions. In particular, we focus on the deployment of the system in a target medical center with small historical samples. METHODS Light Gradient Boosting Machine (LightGBM) and multilayer perceptron (MLP) were trained on Medical Information Mart for Intensive Care (MIMIC-III) dataset and then finetuned on the private Historical Database of local Ruijin Hospital (HDRJH) using transfer learning technique. The Shapley Additive Explanations (SHAP) analysis was employed to characterize the feature importance in the prediction inference. Ultimately, the performance of the sepsis prediction system was further evaluated in the real-world study in the ICU of the target Ruijin Hospital. RESULTS The datasets comprised 6891 patients from MIMIC-III, 453 from HDRJH, and 67 from Ruijin real-world data. The area under the receiver operating characteristic curves (AUCs) for LightGBM and MLP models derived from MIMIC-III were 0.98 - 0.98 and 0.95 - 0.96 respectively on MIMIC-III dataset, and, in comparison, 0.82 - 0.86 and 0.84 - 0.87 respectively on HDRJH, from 1 to 5 h preceding. After transfer learning and ensemble learning, the AUCs of the final ensemble model were enhanced to 0.94 - 0.94 on HDRJH and to 0.86 - 0.9 in the real-world study in the ICU of the target Ruijin Hospital. In addition, the SHAP analysis illustrated the importance of age, antibiotics, net balance, and ventilation for sepsis prediction, making the model interpretable. CONCLUSIONS Our machine learning model allows accurate real-time prediction of sepsis within 5-h preceding. Transfer learning can effectively improve the feasibility to deploy the prediction model in the target cohort, and ameliorate the model performance for external validation. 
SHAP analysis indicates that the role of antibiotic usage and fluid management needs further investigation. We argue that our system and methodology have the potential to improve ICU management by helping medical practitioners identify at-sepsis-risk patients and prepare for timely diagnosis and intervention. TRIAL REGISTRATION NCT05088850 (retrospectively registered).
Affiliation(s)
- Qiyu Chen
- Department of Applied Mathematics, School of Mathematical Sciences, Fudan University, Shanghai, 200433 China
- Ranran Li
- Department of Critical Care Medicine, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, 200025 China
- ChihChe Lin
- Shanghai Electric Group Co., Ltd., Central Academe, Shanghai, China
- Chiming Lai
- Shanghai Electric Group Co., Ltd., Central Academe, Shanghai, China
- Dechang Chen
- Department of Critical Care Medicine, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, 200025 China
- Hongping Qu
- Department of Critical Care Medicine, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, 200025 China
- Yaling Huang
- Shanghai Electric Group Co., Ltd., Central Academe, Shanghai, China
- Wenlian Lu
- Department of Applied Mathematics, School of Mathematical Sciences, Fudan University, Shanghai, 200433 China
- Yaoqing Tang
- Department of Critical Care Medicine, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, 200025 China
- Lei Li
- Department of Critical Care Medicine, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, 200025 China
|
18
|
Multilayer dynamic ensemble model for intensive care unit mortality prediction of neonate patients. J Biomed Inform 2022; 135:104216. [DOI: 10.1016/j.jbi.2022.104216] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/24/2021] [Revised: 09/25/2022] [Accepted: 09/28/2022] [Indexed: 12/26/2022]
|
19
|
Coombes CE, Coombes KR, Fareed N. Sequences of Events from the Electronic Medical Record and the Onset of Infection. Chem Biodivers 2022; 19:e202200657. [PMID: 36216587 DOI: 10.1002/cbdv.202200657] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2022] [Accepted: 09/15/2022] [Indexed: 11/06/2022]
Abstract
We present a novel model of time-series analysis to learn from electronic health record (EHR) data when infection occurred in the intensive care unit (ICU) by translating methods from proteomics and Bayesian statistics. Using 48,536 patients hospitalized in an ICU, we describe each hospital course as an 'alphabet' of 23 physician actions ('events') in temporal order. We analyze these as k-mers of length 3-12 events and apply a Bayesian model of (cumulative) relative risk (RR). The log2-transformed RR (median=0.248, mean=0.226) supported the conclusion that the events selected were individually associated with increased risk of infection. Selecting from all possible cutoffs of maximum gain (MG), MG>0.0244 predicts administration of antibiotics with PPV 82.0 %, NPV 44.4 %, and AUC 0.706. Our approach holds value for retrospective analysis of other clinical syndromes for which time-of-onset is critical to analysis but poorly marked in EHRs, including delirium and decompensation.
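The k-mer representation described in this abstract can be sketched as below. This is only the enumeration step, under the assumption that each hospital course is an ordered list of event symbols; the Bayesian relative-risk model is not reproduced, and the function name and single-letter event codes are hypothetical.

```python
from collections import Counter


def count_kmers(events, k_min=3, k_max=12):
    """Count every contiguous run of k events for k_min <= k <= k_max."""
    counts = Counter()
    for k in range(k_min, k_max + 1):
        for start in range(len(events) - k + 1):
            counts[tuple(events[start:start + k])] += 1
    return counts


# Toy hospital course; each letter stands for one physician action.
course = list("ABABAB")
kmers = count_kmers(course)
```

Courses shorter than `k_min` events simply yield an empty counter, which is one reason a lower bound on k matters for sparse records.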
Affiliation(s)
- Caitlin E Coombes
- Department of Anesthesiology, Stanford University, 300 Pasteur Dr., Palo Alto, CA 94305, USA
- Kevin R Coombes
- Department of Population Health Sciences, Medical College of Georgia, 1420 Laney Walker Blvd, Augusta, GA 30912, USA
- Naleef Fareed
- Department of Biomedical Informatics, The Ohio State University College of Medicine, 370 W 9th Ave, Columbus, OH 43210, USA
|
20
|
Deng Y, Liu S, Wang Z, Wang Y, Jiang Y, Liu B. Explainable time-series deep learning models for the prediction of mortality, prolonged length of stay and 30-day readmission in intensive care patients. Front Med (Lausanne) 2022; 9:933037. [PMID: 36250092 PMCID: PMC9554013 DOI: 10.3389/fmed.2022.933037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2022] [Accepted: 09/01/2022] [Indexed: 11/14/2022] Open
Abstract
Background In-hospital mortality, prolonged length of stay (LOS), and 30-day readmission are common outcomes in the intensive care unit (ICU). Traditional scoring systems and machine learning models for predicting these outcomes usually ignore the characteristics of ICU data, which are time-series forms. We aimed to use time-series deep learning models with the selective combination of three widely used scoring systems to predict these outcomes. Materials and methods A retrospective cohort study was conducted on 40,083 patients in ICU from the Medical Information Mart for Intensive Care-IV (MIMIC-IV) database. Three deep learning models, namely, recurrent neural network (RNN), gated recurrent unit (GRU), and long short-term memory (LSTM) with attention mechanisms, were trained for the prediction of in-hospital mortality, prolonged LOS, and 30-day readmission with variables collected during the initial 24 h after ICU admission or the last 24 h before discharge. The inclusion of variables was based on three widely used scoring systems, namely, APACHE II, SOFA, and SAPS II, and the predictors consisted of time-series vital signs, laboratory tests, medication, and procedures. The patients were randomly divided into a training set (80%) and a test set (20%), which were used for model development and model evaluation, respectively. The area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and Brier scores were used to evaluate model performance. Variable significance was identified through attention mechanisms. Results A total of 33 variables for 40,083 patients were enrolled for mortality and prolonged LOS prediction and 36,180 for readmission prediction. The rates of occurrence of the three outcomes were 9.74%, 27.54%, and 11.79%, respectively. In each of the three outcomes, the performance of RNN, GRU, and LSTM did not differ greatly. 
Mortality prediction models, prolonged LOS prediction models, and readmission prediction models achieved AUCs of 0.870 ± 0.001, 0.765 ± 0.003, and 0.635 ± 0.018, respectively. The top significant variables co-selected by the three deep learning models were Glasgow Coma Scale (GCS), age, blood urea nitrogen, and norepinephrine for mortality; GCS, invasive ventilation, and blood urea nitrogen for prolonged LOS; and blood urea nitrogen, GCS, and ethnicity for readmission. Conclusion The prognostic prediction models established in our study achieved good performance in predicting common outcomes of patients in ICU, especially in mortality prediction. In addition, GCS and blood urea nitrogen were identified as the most important factors strongly associated with adverse ICU events.
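Three of the evaluation metrics named in this abstract (Brier score, sensitivity, specificity) are simple enough to sketch directly. The code below is an illustration with made-up labels and probabilities, not the study's pipeline; the 0.5 decision threshold is an assumption, since the abstract does not state one.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def sensitivity_specificity(y_true, y_prob, threshold=0.5):
    """Sensitivity and specificity after thresholding the probabilities."""
    tp = fn = tn = fp = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Toy usage: 1 = outcome occurred (for example, in-hospital death).
labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.4, 0.6]
bs = brier_score(labels, probs)
sens, spec = sensitivity_specificity(labels, probs)
```

Unlike AUC, the Brier score also penalises poorly calibrated probabilities, which is why the two are often reported together.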
Affiliation(s)
- Yuhan Deng
- School of Public Health, Peking University, Beijing, China
- Shuang Liu
- School of Public Health, Peking University, Beijing, China
- Ziyao Wang
- School of Public Health, Peking University, Beijing, China
- Yuxin Wang
- School of Public Health, Peking University, Beijing, China
- Yong Jiang
- Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- China National Clinical Research Center for Neurological Diseases, Beijing, China
- Yong Jiang,
- Baohua Liu
- School of Public Health, Peking University, Beijing, China
- *Correspondence: Baohua Liu,
|
21
|
邓 宇, 姜 勇, 王 子, 刘 爽, 汪 雨, 刘 宝. [Long short-term memory and Logistic regression for mortality risk prediction of intensive care unit patients with stroke]. BEIJING DA XUE XUE BAO. YI XUE BAN = JOURNAL OF PEKING UNIVERSITY. HEALTH SCIENCES 2022; 54:458-467. [PMID: 35701122 PMCID: PMC9197695 DOI: 10.19723/j.issn.1671-167x.2022.03.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2022] [Indexed: 06/15/2023]
Abstract
OBJECTIVE To select variables related to the mortality risk of stroke patients in the intensive care unit (ICU) through long short-term memory (LSTM) with attention mechanisms and Logistic regression with L1 norm, to construct a mortality risk prediction model based on conventional Logistic regression with the important variables selected from the two models, and to evaluate the model performance. METHODS The Medical Information Mart for Intensive Care (MIMIC)-IV database was retrospectively analyzed and patients who were primarily diagnosed with stroke were selected as the study population. The outcome was defined as whether the patient died in hospital after admission. Candidate predictors included demographic information, complications, laboratory tests and vital signs in the initial 48 h after ICU admission. The data were randomly divided into a training set and a test set ten times at a ratio of 8:2. In the training sets, LSTM with attention mechanisms and Logistic regression with L1 norm were constructed to select important variables. In the test sets, the mean importance of variables over the ten splits was used as a reference to pick out the top 10 variables in each of the two models, and these variables were then included in conventional Logistic regression to build the final prediction model. Model evaluation was based on the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and accuracy. The model performance was compared with a forward Logistic regression model that had not undergone prior variable selection. RESULTS A total of 2,755 patients with 2,979 ICU admission records were included in the analysis, of which 526 recorded deaths. The AUC of the Logistic regression model with L1 norm was statistically better than that of LSTM with attention mechanisms (0.819±0.031 vs. 0.760±0.018, P < 0.001). Age, blood glucose, and blood urea nitrogen were among the top ten important variables in both models.
AUC, sensitivity, specificity, and accuracy of Logistic regression models were 0.85, 85.98%, 71.74% and 74.26%, respectively. The final prediction model was superior to the forward Logistic regression model. CONCLUSION The variables selected by Logistic regression with L1 norm and LSTM with attention mechanisms had good prediction performance, which showed important implications for the mortality prediction of stroke patients in the ICU.
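The variable-selection step in this abstract (averaging variable importance over the ten random splits and keeping the top 10 per model) can be sketched as follows. It assumes each split yields a mapping from variable name to an importance score, such as an attention weight or an L1 coefficient, and that absolute values are what gets averaged; the function name and toy scores are hypothetical.

```python
def top_k_by_mean_importance(runs, k=10):
    """Rank variables by mean absolute importance across repeated splits."""
    totals = {}
    for run in runs:
        for name, score in run.items():
            totals[name] = totals.get(name, 0.0) + abs(score)
    means = {name: total / len(runs) for name, total in totals.items()}
    return sorted(means, key=means.get, reverse=True)[:k]


# Toy usage with two splits and three candidate predictors.
splits = [
    {"age": 0.50, "glucose": 0.30, "bun": 0.10},
    {"age": 0.40, "glucose": 0.10, "bun": 0.35},
]
selected = top_k_by_mean_importance(splits, k=2)
```

Running the same helper on both models' scores and intersecting the two lists would give the variables shared by both, such as age, blood glucose, and blood urea nitrogen in the abstract.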
Affiliation(s)
- 宇含 邓
- Department of Social Medicine and Health Education, Peking University School of Public Health, Beijing 100191, China
- 勇 姜
- China National Clinical Research Center for Neurological Diseases, Department of Neurology, Beijing Tian Tan Hospital, Capital Medical University, Beijing 100050, China
- Beijing Advanced Innovation Center for Big Data-Based Precision Medicine (Beihang University & Capital Medical University), Beijing 100070, China
- 子尧 王
- Department of Social Medicine and Health Education, Peking University School of Public Health, Beijing 100191, China
- 爽 刘
- Department of Social Medicine and Health Education, Peking University School of Public Health, Beijing 100191, China
- 雨欣 汪
- Department of Social Medicine and Health Education, Peking University School of Public Health, Beijing 100191, China
- 宝花 刘
- Department of Social Medicine and Health Education, Peking University School of Public Health, Beijing 100191, China
|