Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Wu S, Liu S, Sohn S, Moon S, Wi CI, Juhn Y, Liu H. Modeling asynchronous event sequences with RNNs. J Biomed Inform 2018;83:167-177. [PMID: 29883623 PMCID: PMC6103779 DOI: 10.1016/j.jbi.2018.05.016] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 06/26/2017] [Revised: 05/10/2018] [Accepted: 05/26/2018] [Indexed: 12/14/2022]

Cited by Other Article(s)

Varošanec AM, Marković L, Sonicki Z. A Novel Time-Aware Deep Learning Model Predicting Myopia in Children and Adolescents. OPHTHALMOLOGY SCIENCE 2024;4:100563. [PMID: 39165695 PMCID: PMC11334700 DOI: 10.1016/j.xops.2024.100563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2024] [Revised: 05/26/2024] [Accepted: 06/05/2024] [Indexed: 08/22/2024]

Abstract

Objective

To quantitatively predict children's and adolescents' spherical equivalent (SE) by leveraging their variable-length historical vision records.

Design

Retrospective analysis.

Participants

Eight hundred ninety-five myopic children and adolescents aged 4 to 18 years, with a complete ophthalmic examination and retinoscopy in cycloplegia prior to spectacle correction, were enrolled in the period from January 1, 2008 to July 1, 2023 at the University Hospital "Sveti Duh," Zagreb, Croatia.

Methods

A novel modification of time-aware long short-term memory (LSTM) was used to quantitatively predict children's and adolescents' SE within 7 years after diagnosis.

Main Outcome Measures

The utilization of extended gate time-aware LSTM involved capturing temporal features within irregularly sampled time series data. This approach aligned more closely with the characteristics of fact-based data, increasing its applicability and contributing to the early identification of myopia progression.

Results

The testing set exhibited a mean absolute prediction error (MAE) of 0.10 ± 0.15 diopter (D) for SE. Lower MAE values were associated with longer sequence lengths, shorter prediction durations, older age groups, and low myopia, while higher MAE values were observed with shorter sequence lengths, longer prediction durations, younger age groups, and in premyopic or high myopic individuals, ranging from as low as 0.03 ± 0.04 D to as high as 0.45 ± 0.24 D.

Conclusions

Extended gate time-aware LSTM capturing temporal features in irregularly sampled time series data can be used to quantitatively predict children's and adolescents' SE within 7 years with an overall error of 0.10 ± 0.15 D. This value is substantially lower than the threshold for prediction to be considered clinically acceptable, such as a criterion of 0.75 D.

Financial Disclosures

The author(s) have no proprietary or commercial interest in any materials discussed in this article.

Nigo M, Rasmy L, Mao B, Kannadath BS, Xie Z, Zhi D. Deep learning model for personalized prediction of positive MRSA culture using time-series electronic health records. Nat Commun 2024;15:2036. [PMID: 38448409 PMCID: PMC10917736 DOI: 10.1038/s41467-024-46211-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Accepted: 02/19/2024] [Indexed: 03/08/2024]

Budiarto A, Tsang KCH, Wilson AM, Sheikh A, Shah SA. Machine Learning-Based Asthma Attack Prediction Models From Routinely Collected Electronic Health Records: Systematic Scoping Review. JMIR AI 2023;2:e46717. [PMID: 38875586 PMCID: PMC11041490 DOI: 10.2196/46717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Revised: 09/28/2023] [Accepted: 10/09/2023] [Indexed: 06/16/2024]

Abstract

BACKGROUND

An early warning tool to predict attacks could enhance asthma management and reduce the likelihood of serious consequences. Electronic health records (EHRs) providing access to historical data about patients with asthma coupled with machine learning (ML) provide an opportunity to develop such a tool. Several studies have developed ML-based tools to predict asthma attacks.

OBJECTIVE

This study aims to critically evaluate ML-based models derived using EHRs for the prediction of asthma attacks.

METHODS

We systematically searched PubMed and Scopus (the search period was between January 1, 2012, and January 31, 2023) for papers meeting the following inclusion criteria: (1) used EHR data as the main data source, (2) used asthma attack as the outcome, and (3) compared ML-based prediction models' performance. We excluded non-English papers and nonresearch papers, such as commentary and systematic review papers. In addition, we also excluded papers that did not provide any details about the respective ML approach and its result, including protocol papers. The selected studies were then summarized across multiple dimensions including data preprocessing methods, ML algorithms, model validation, model explainability, and model implementation.

RESULTS

Overall, 17 papers were included at the end of the selection process. There was considerable heterogeneity in how asthma attacks were defined. Of the 17 studies, 8 (47%) used routinely collected data from both primary care and secondary care practices. Extremely imbalanced data was a notable issue in most studies (13/17, 76%), but only 38% (5/13) of them explicitly dealt with it in their data preprocessing pipeline. The gradient boosting-based method was the best ML method in 59% (10/17) of the studies. Of the 17 studies, 14 (82%) used a model explanation method to identify the most important predictors. None of the studies followed the standard reporting guidelines, and none were prospectively validated.

CONCLUSIONS

Our review indicates that this research field is still underdeveloped, given the limited body of evidence, heterogeneity of methods, lack of external validation, and suboptimally reported models. We highlighted several technical challenges (class imbalance, external validation, model explanation, and adherence to reporting guidelines to aid reproducibility) that need to be addressed to make progress toward clinical adoption.

Pungitore S, Subbian V. Assessment of Prediction Tasks and Time Window Selection in Temporal Modeling of Electronic Health Record Data: a Systematic Review. JOURNAL OF HEALTHCARE INFORMATICS RESEARCH 2023;7:313-331. [PMID: 37637723 PMCID: PMC10449760 DOI: 10.1007/s41666-023-00143-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2022] [Revised: 04/12/2023] [Accepted: 07/28/2023] [Indexed: 08/29/2023]

Abstract

Temporal electronic health record (EHR) data are often preferred for clinical prediction tasks because they offer more complete representations of a patient's pathophysiology than static data. A challenge when working with temporal EHR data is problem formulation, which includes defining the time windows of interest and the prediction task. Our objective was to conduct a systematic review that assessed the definition and reporting of concepts relevant to temporal clinical prediction tasks. We searched PubMed® and IEEE Xplore® databases for studies from January 1, 2010 applying machine learning models to EHR data for patient outcome prediction. Publications applying time-series methods were selected for further review. We identified 92 studies and summarized them by clinical context and definition and reporting of the prediction problem. For the time windows of interest, 12 studies did not discuss window lengths, 57 used a single set of window lengths, and 23 evaluated the relationship between window length and model performance. We also found that 72 studies had appropriate reporting of the prediction task. However, evaluation of prediction problem formulation for temporal EHR data was complicated by heterogeneity in assessing and reporting of these concepts. Even among studies modeling similar clinical outcomes, there were variations in terminology used to describe the prediction problem, rationale for window lengths, and determination of the outcome of interest. As temporal modeling using EHR data expands, minimal reporting standards should include time-series specific concerns to promote rigor and reproducibility in future studies and facilitate model implementation in clinical settings.

Supplementary Information

The online version contains supplementary material available at 10.1007/s41666-023-00143-4.

Budiarto A, Sheikh A, Wilson A, Price DB, Shah SA. Handling Class Imbalance in Machine Learning-based Prediction Models: A Case Study in Asthma Management. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2023;2023:1-5. [PMID: 38083129 DOI: 10.1109/embc40787.2023.10340751] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/18/2023]

Abstract

A data-driven prediction tool has the potential to provide early warning of an asthma attack and improve asthma management and outcomes. Most previous machine learning (ML)-based studies for asthma attack prediction have reported a severe class imbalance, with major implications for model performance. We aimed to undertake a systematic comparison of several class imbalance handling techniques in the context of risk prediction models for asthma prognosis. We used data from 9,835 asthma patients extracted from the Medical Information Mart for Intensive Care (MIMIC) IV database and deployed five class imbalance handling methods based on synthetic minority oversampling technique (SMOTE) and cost function customisation. We then compared their performances in improving two-class classifier models developed using logistic regression (LR) and extreme gradient boosting (XGBoost) for three different prediction tasks with varying severity of class imbalance (proportion of majority class ranging from 90.86% to 98.98%). The cost function customisation technique substantially outperformed the SMOTE-based methods in all tasks. XGBoost combined with cost function customisation achieved the highest prediction performance for the outcome with the most extreme class imbalance ratio (AUC = 0.72). Our findings suggest that the cost function customisation-based approach to tackle class imbalance provides substantially better performance compared to oversampling in the context of asthma management.

Clinical Relevance: This study underscores the challenge of class imbalance in the context of prediction tools to improve asthma management and outcomes and provides a methodological solution that addresses the challenge. Accurate asthma prediction tools can provide early warning and potentially prevent deterioration, thereby improving the quality of life of patients with asthma.
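
As a rough illustration of how cost function customisation differs from oversampling, the sketch below leaves the data untouched and instead makes errors on the minority (positive) class more expensive in a binary cross-entropy loss. The function names and the negatives-to-positives weighting rule are assumptions made for illustration; the abstract does not state which cost function the authors used.

```python
import math


def positive_class_weight(labels):
    """Ratio of negatives to positives, a common heuristic weight for the minority class."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0:
        raise ValueError("labels contain no positive examples")
    return n_neg / n_pos


def weighted_log_loss(labels, probs, pos_weight, eps=1e-12):
    """Mean binary cross-entropy in which errors on positives cost pos_weight times more."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
        if y == 1:
            total += -pos_weight * math.log(p)
        else:
            total += -math.log(1.0 - p)
    return total / len(labels)


# Hypothetical toy data: 90% majority class, and a model that always predicts low risk.
labels = [0] * 9 + [1]
probs = [0.1] * 10
weight = positive_class_weight(labels)
print("unweighted loss:", weighted_log_loss(labels, probs, 1.0))
print("weighted loss:  ", weighted_log_loss(labels, probs, weight))
```

With the weight applied, a model that ignores the rare class is penalised much more heavily, which is the intended effect. In XGBoost a comparable effect is commonly obtained through the `scale_pos_weight` parameter, although whether the study used that mechanism is not stated.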

Luo J, Lan L, Huang S, Zeng X, Xiang Q, Li M, Yang S, Zhao W, Zhou X. Real-time prediction of organ failures in patients with acute pancreatitis using longitudinal irregular data. J Biomed Inform 2023;139:104310. [PMID: 36773821 DOI: 10.1016/j.jbi.2023.104310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2022] [Revised: 01/10/2023] [Accepted: 02/06/2023] [Indexed: 02/12/2023]

Liu LJ, Ortiz-Soriano V, Neyra JA, Chen J. KIT-LSTM: Knowledge-guided Time-aware LSTM for Continuous Clinical Risk Prediction. PROCEEDINGS. IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE 2022;2022:1086-1091. [PMID: 37131483 PMCID: PMC10151119 DOI: 10.1109/bibm55620.2022.9994931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]

Silva JF, Matos S. Modelling patient trajectories using multimodal information. J Biomed Inform 2022;134:104195. [PMID: 36150641 DOI: 10.1016/j.jbi.2022.104195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2022] [Revised: 07/16/2022] [Accepted: 08/30/2022] [Indexed: 11/26/2022]

Abstract

BACKGROUND

Electronic Health Records (EHRs) aggregate diverse information at the patient level, holding a trajectory representative of the evolution of the patient health status throughout time. Although this information provides context and can be leveraged by physicians to monitor patient health and make more accurate prognoses/diagnoses, patient records can contain information from very long time spans, which, combined with the rapid generation rate of medical data, makes clinical decision making more complex. Patient trajectory modelling can assist by exploring existing information in a scalable manner, and can contribute to improving health care quality by fostering preventive medicine practices (e.g. earlier disease diagnosis).

METHODS

We propose a solution to model patient trajectories that combines different types of information (e.g. clinical text, standard codes) and considers the temporal aspect of clinical data. This solution leverages two different architectures: one supporting flexible sets of input features, to convert patient admissions into dense representations; and a second exploring extracted admission representations in a recurrent-based architecture, where patient trajectories are processed in sub-sequences using a sliding window mechanism.
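
The sliding window mechanism mentioned above can be pictured with a minimal sketch: one patient's ordered admissions are cut into overlapping sub-sequences of fixed length, and each sub-sequence would then be fed to the recurrent model. The function name, the `window_size` and `stride` parameters, and the handling of short trajectories are assumptions for illustration and are not taken from the PatientTM code.

```python
def sliding_windows(admissions, window_size, stride=1):
    """Split one patient's ordered admissions into overlapping sub-sequences.

    Trajectories no longer than window_size are returned as a single window.
    """
    if window_size < 1 or stride < 1:
        raise ValueError("window_size and stride must be positive")
    if not admissions:
        return []
    if len(admissions) <= window_size:
        return [list(admissions)]
    last_start = len(admissions) - window_size
    return [list(admissions[i:i + window_size]) for i in range(0, last_start + 1, stride)]


# Hypothetical trajectory of four admission representations (placeholders, not real data).
trajectory = ["adm1", "adm2", "adm3", "adm4"]
for window in sliding_windows(trajectory, window_size=3):
    print(window)
```

A stride larger than 1 reduces overlap between windows at the cost of possibly dropping trailing admissions that do not fill a complete window.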

RESULTS

The developed solution was evaluated on two different clinical outcomes, unexpected patient readmission and disease progression, using the publicly available Medical Information Mart for Intensive Care (MIMIC)-III clinical database. The results obtained demonstrate the potential of the first architecture to model readmission and diagnoses prediction using single patient admissions. While information from clinical text did not show the discriminative power observed in other existing works, this may be explained by the need to fine-tune the clinicalBERT model. Finally, we demonstrate the potential of the sequence-based architecture using a sliding window mechanism to represent the input data, attaining comparable performances to other existing solutions.

CONCLUSION

Herein, we explored DL-based techniques to model patient trajectories and propose two flexible architectures that explore patient admissions on an individual and sequence basis. The combination of clinical text with other types of information led to positive results, which can be further improved by including a fine-tuned version of clinicalBERT in the architectures. The proposed solution can be publicly accessed at https://github.com/bioinformatics-ua/PatientTM.

Ghazi MM, Sorensen L, Ourselin S, Nielsen M. CARRNN: A Continuous Autoregressive Recurrent Neural Network for Deep Representation Learning From Sporadic Temporal Data. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022;PP:792-802. [PMID: 35666790 DOI: 10.1109/tnnls.2022.3177366] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]

Rasmy L, Nigo M, Kannadath BS, Xie Z, Mao B, Patel K, Zhou Y, Zhang W, Ross A, Xu H, Zhi D. Recurrent neural network models (CovRNN) for predicting outcomes of patients with COVID-19 on admission to hospital: model development and validation using electronic health record data. Lancet Digit Health 2022;4:e415-e425. [PMID: 35466079 PMCID: PMC9023005 DOI: 10.1016/s2589-7500(22)00049-8] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 10/01/2021] [Revised: 01/11/2022] [Accepted: 03/07/2022] [Indexed: 02/08/2023]

Abstract

BACKGROUND

Predicting outcomes of patients with COVID-19 at an early stage is crucial for optimised clinical care and resource management, especially during a pandemic. Although multiple machine learning models have been proposed to address this issue, because of their requirements for extensive data preprocessing and feature engineering, they have not been validated or implemented outside of their original study site. Therefore, we aimed to develop accurate and transferrable predictive models of outcomes on hospital admission for patients with COVID-19.

METHODS

In this study, we developed recurrent neural network-based models (CovRNN) to predict the outcomes of patients with COVID-19 by use of available electronic health record data on admission to hospital, without the need for specific feature selection or missing data imputation. CovRNN was designed to predict three outcomes: in-hospital mortality, need for mechanical ventilation, and prolonged hospital stay (>7 days). For in-hospital mortality and mechanical ventilation, CovRNN produced time-to-event risk scores (survival prediction; evaluated by the concordance index) and all-time risk scores (binary prediction; area under the receiver operating characteristic curve [AUROC] was the main metric); we only trained a binary classification model for prolonged hospital stay. For binary classification tasks, we compared CovRNN against traditional machine learning algorithms: logistic regression and light gradient boost machine. Our models were trained and validated on the heterogeneous, deidentified data of 247 960 patients with COVID-19 from 87 US health-care systems derived from the Cerner Real-World COVID-19 Q3 Dataset up to September 2020. We held out the data of 4175 patients from two hospitals for external validation. The remaining 243 785 patients from the 85 health systems were grouped into training (n=170 626), validation (n=24 378), and multi-hospital test (n=48 781) sets. Model performance was evaluated in the multi-hospital test set. The transferability of CovRNN was externally validated by use of deidentified data from 36 140 patients derived from the US-based Optum deidentified COVID-19 electronic health record dataset (version 1015; from January, 2007, to Oct 15, 2020). Exact dates of data extraction were masked by the databases to ensure patient data safety.
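
For readers unfamiliar with the concordance index used to evaluate the time-to-event risk scores, the sketch below shows one standard pairwise formulation (Harrell's C): among pairs where the patient with the shorter follow-up time actually had the event, it counts how often that patient also received the higher risk score. This is a generic textbook illustration with made-up inputs, not the evaluation code used for CovRNN.

```python
def concordance_index(times, events, risks):
    """Harrell's C for right-censored data.

    times  : follow-up time per patient
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk score (higher means earlier expected event)
    """
    if not (len(times) == len(events) == len(risks)):
        raise ValueError("inputs must have the same length")
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: the third patient is censored.
times = [1, 2, 3]
events = [1, 1, 0]
print(concordance_index(times, events, [0.9, 0.5, 0.1]))  # perfectly ordered risks
print(concordance_index(times, events, [0.1, 0.5, 0.9]))  # perfectly reversed risks
```

A value of 0.5 corresponds to random ordering, so the reported survival-model figures should be read against that baseline rather than against zero.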

FINDINGS

CovRNN binary models achieved AUROCs of 93·0% (95% CI 92·6-93·4) for the prediction of in-hospital mortality, 92·9% (92·6-93·2) for the prediction of mechanical ventilation, and 86·5% (86·2-86·9) for the prediction of a prolonged hospital stay, outperforming light gradient boost machine and logistic regression algorithms. External validation confirmed AUROCs in similar ranges (91·3-97·0% for in-hospital mortality prediction, 91·5-96·0% for the prediction of mechanical ventilation, and 81·0-88·3% for the prediction of prolonged hospital stay). For survival prediction, CovRNN achieved a concordance index of 86·0% (95% CI 85·1-86·9) for in-hospital mortality and 92·6% (92·2-93·0) for mechanical ventilation.

INTERPRETATION

Trained on a large, heterogeneous, real-world dataset, our CovRNN models showed high prediction accuracy and transferability through consistently good performances on multiple external datasets. Our results show the feasibility of a COVID-19 predictive model that delivers high accuracy without the need for complex feature engineering.

FUNDING

Cancer Prevention and Research Institute of Texas.

Xie F, Yuan H, Ning Y, Ong MEH, Feng M, Hsu W, Chakraborty B, Liu N. Deep learning for temporal data representation in electronic health records: A systematic review of challenges and methodologies. J Biomed Inform 2021;126:103980. [PMID: 34974189 DOI: 10.1016/j.jbi.2021.103980] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 07/23/2021] [Revised: 11/07/2021] [Accepted: 12/20/2021] [Indexed: 12/21/2022]

Yang YC, Islam SU, Noor A, Khan S, Afsar W, Nazir S. Influential Usage of Big Data and Artificial Intelligence in Healthcare. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2021;2021:5812499. [PMID: 34527076 PMCID: PMC8437645 DOI: 10.1155/2021/5812499] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 07/08/2021] [Accepted: 08/09/2021] [Indexed: 01/07/2023]

Mitra R, MacLean AL. RVAgene: Generative modeling of gene expression time series data. Bioinformatics 2021;37:3252-3262. [PMID: 33974008 PMCID: PMC8504625 DOI: 10.1093/bioinformatics/btab260] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/17/2020] [Revised: 04/19/2021] [Accepted: 04/22/2021] [Indexed: 12/04/2022]

Ferté T, Cossin S, Schaeverbeke T, Barnetche T, Jouhet V, Hejblum BP. Automatic phenotyping of electronical health record: PheVis algorithm. J Biomed Inform 2021;117:103746. [PMID: 33746080 DOI: 10.1016/j.jbi.2021.103746] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/28/2020] [Revised: 03/02/2021] [Accepted: 03/05/2021] [Indexed: 11/18/2022]

Sisk R, Lin L, Sperrin M, Barrett JK, Tom B, Diaz-Ordaz K, Peek N, Martin GP. Informative presence and observation in routine health data: A review of methodology for clinical risk prediction. J Am Med Inform Assoc 2021;28:155-166. [PMID: 33164082 PMCID: PMC7810439 DOI: 10.1093/jamia/ocaa242] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 06/26/2020] [Accepted: 09/17/2020] [Indexed: 12/20/2022]

Abstract

Objective

Informative presence (IP) is the phenomenon whereby the presence or absence of patient data is potentially informative with respect to their health condition, with informative observation (IO) being the longitudinal equivalent. These phenomena predominantly exist within routinely collected healthcare data, in which data collection is driven by the clinical requirements of patients and clinicians. The extent to which IP and IO are considered when using such data to develop clinical prediction models (CPMs) is unknown, as is the existing methodology aiming at handling these issues. This review aims to synthesize such existing methodology, thereby helping identify an agenda for future methodological work.

Materials and Methods

A systematic literature search was conducted by 2 independent reviewers using prespecified keywords.

Results

Thirty-six articles were included. We categorized the methods presented within as derived predictors (including some representation of the measurement process as a predictor in the model), modeling under IP, and latent structures. Including missing indicators or summary measures as predictors is the most commonly presented approach amongst the included studies (24 of 36 articles).

Discussion

This is the first review to collate the literature in this area under a prediction framework. A considerable body of relevant literature exists, and we present ways in which the described methods could be developed further. Guidance is required for specifying the conditions under which each method should be used to enable applied prediction modelers to use these methods.

Conclusions

A growing recognition of IP and IO exists within the literature, and methodology is increasingly becoming available to leverage these phenomena for prediction purposes. IP and IO should be approached differently in a prediction context than when the primary goal is explanation. The work included in this review has demonstrated theoretical and empirical benefits of incorporating IP and IO, and therefore we recommend that applied health researchers consider incorporating these methods in their work.

Xiang Y, Ji H, Zhou Y, Li F, Du J, Rasmy L, Wu S, Zheng WJ, Xu H, Zhi D, Zhang Y, Tao C. Asthma Exacerbation Prediction and Risk Factor Analysis Based on a Time-Sensitive, Attentive Neural Network: Retrospective Cohort Study. J Med Internet Res 2020;22:e16981. [PMID: 32735224 PMCID: PMC7428917 DOI: 10.2196/16981] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 11/09/2019] [Revised: 03/02/2020] [Accepted: 05/13/2020] [Indexed: 12/13/2022]

Estiri H, Strasser ZH, Klann JG, McCoy TH, Wagholikar KB, Vasey S, Castro VM, Murphy ME, Murphy SN. Transitive Sequencing Medical Records for Mining Predictive and Interpretable Temporal Representations. PATTERNS (NEW YORK, N.Y.) 2020;1:100051. [PMID: 32835307 PMCID: PMC7301790 DOI: 10.1016/j.patter.2020.100051] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 02/13/2020] [Revised: 04/27/2020] [Accepted: 05/26/2020] [Indexed: 12/13/2022]

Affiliation(s)

Hossein Estiri: Laboratory of Computer Science, Massachusetts General Hospital, Boston, MA 02144, USA; Research Information Science and Computing, Mass General Brigham, Somerville, MA 02145, USA; Harvard Medical School, Boston, MA 02115, USA
Zachary H. Strasser: Laboratory of Computer Science, Massachusetts General Hospital, Boston, MA 02144, USA; Research Information Science and Computing, Mass General Brigham, Somerville, MA 02145, USA; Harvard Medical School, Boston, MA 02115, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
Jeffery G. Klann: Laboratory of Computer Science, Massachusetts General Hospital, Boston, MA 02144, USA; Research Information Science and Computing, Mass General Brigham, Somerville, MA 02145, USA; Harvard Medical School, Boston, MA 02115, USA
Thomas H. McCoy: Harvard Medical School, Boston, MA 02115, USA; Center for Quantitative Health, Massachusetts General Hospital, Boston, MA 02114, USA
Kavishwar B. Wagholikar: Laboratory of Computer Science, Massachusetts General Hospital, Boston, MA 02144, USA; Research Information Science and Computing, Mass General Brigham, Somerville, MA 02145, USA; Harvard Medical School, Boston, MA 02115, USA
Sebastien Vasey: Department of Mathematics, Harvard University, Cambridge, MA 02138, USA
Victor M. Castro: Research Information Science and Computing, Mass General Brigham, Somerville, MA 02145, USA
MaryKate E. Murphy: Research Information Science and Computing, Mass General Brigham, Somerville, MA 02145, USA
Shawn N. Murphy: Laboratory of Computer Science, Massachusetts General Hospital, Boston, MA 02144, USA; Research Information Science and Computing, Mass General Brigham, Somerville, MA 02145, USA; Harvard Medical School, Boston, MA 02115, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA; Department of Neurology, Massachusetts General Hospital, Boston, MA 02114, USA

Liu Y, Zhang Q, Zhao G, Liu G, Liu Z. Deep Learning-Based Method of Diagnosing Hyperlipidemia and Providing Diagnostic Markers Automatically. Diabetes Metab Syndr Obes 2020;13:679-691. [PMID: 32210601 PMCID: PMC7073442 DOI: 10.2147/dmso.s242585] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/16/2019] [Accepted: 02/26/2020] [Indexed: 12/18/2022]

Abstract

INTRODUCTION

Research on auxiliary diagnosis has long been an active area worldwide. Implementing auxiliary diagnosis support algorithms for medical text data faces challenges of interpretability and credibility. Improving clinical diagnostic techniques means not only improving diagnostic accuracy but also studying the diagnostic basis further. Traditional methods for identifying diagnostic markers are often costly in time and money, and they typically study only dozens of samples, which makes it difficult to synthesize large amounts of data. The comprehensiveness and reliability of traditional methods therefore leave room for improvement. Establishing a model that can diagnose diseases automatically and provide a diagnostic basis at the same time would benefit medical diagnostic techniques.

METHODS

Here, we established an auxiliary diagnostic tool based on an attention-based deep learning algorithm to diagnose hyperlipidemia and automatically predict the corresponding diagnostic markers using hematological parameters. We demonstrated not only the ability of the proposed model to diagnose diseases automatically from text-based medical data, such as physiological parameters, but also its ability to predict disease diagnostic markers. Human physiological parameters are used as input to the model, and the doctor's diagnosis as output. The attention layer exposes how much attention the model pays to each physiological parameter; in this way, the model provides a diagnostic basis.

RESULTS

The model achieved 94% accuracy (ACC), 97.48% AUC, 96% sensitivity, and 92% specificity on the test dataset. All samples were drawn from clinical practice. Moreover, the model predicted the diagnostic markers of hyperlipidemia through the attention mechanism, and the results fully agreed with the gold standard criteria.

DISCUSSION

The auxiliary diagnosis system proposed in this paper not only achieves accurate and robust performance and can be used for the preliminary diagnosis of patients, but also shows great potential for discovering new diagnostic markers. It can therefore improve the efficiency of clinical diagnosis and, to an extent, shorten the time needed to research a diagnostic basis. This is of positive significance for the development of medical diagnosis.

Juhn Y, Liu H. Artificial intelligence approaches using natural language processing to advance EHR-based clinical research. J Allergy Clin Immunol 2020;145:463-469. [PMID: 31883846 PMCID: PMC7771189 DOI: 10.1016/j.jaci.2019.12.897] [Citation(s) in RCA: 96] [Impact Index Per Article: 24.0] [Received: 10/31/2019] [Revised: 12/18/2019] [Accepted: 12/19/2019] [Indexed: 01/17/2023]

Moon S, Liu S, Scott CG, Samudrala S, Abidian MM, Geske JB, Noseworthy PA, Shellum JL, Chaudhry R, Ommen SR, Nishimura RA, Liu H, Arruda-Olson AM. Automated extraction of sudden cardiac death risk factors in hypertrophic cardiomyopathy patients by natural language processing. Int J Med Inform 2019;128:32-38. [PMID: 31160009 DOI: 10.1016/j.ijmedinf.2019.05.008] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 09/17/2018] [Revised: 01/19/2019] [Accepted: 05/11/2019] [Indexed: 01/12/2023]

Abstract

BACKGROUND

The management of hypertrophic cardiomyopathy (HCM) patients requires the knowledge of risk factors associated with sudden cardiac death (SCD). SCD risk factors such as syncope and family history of SCD (FH-SCD) as well as family history of HCM (FH-HCM) are documented in electronic health records (EHRs) as clinical narratives. Automated extraction of risk factors from clinical narratives by natural language processing (NLP) may expedite management workflow of HCM patients. The aim of this study was to develop and deploy NLP algorithms for automated extraction of syncope, FH-SCD, and FH-HCM from clinical narratives.

METHODS AND RESULTS

We randomly selected 200 patients from the Mayo HCM registry for development (n = 100) and testing (n = 100) of NLP algorithms for extraction of syncope, FH-SCD as well as FH-HCM from clinical narratives of EHRs. The clinical reference standard was manually abstracted by 2 independent annotators. Performance of NLP algorithms was compared to aggregation and summarization of data entries in the HCM registry for syncope, FH-SCD, and FH-HCM. We also compared the NLP algorithms with billing codes for syncope as well as responses to patient survey questions for FH-SCD and FH-HCM. These analyses demonstrated NLP had superior sensitivity (0.96 vs 0.39, p < 0.001) and comparable specificity (0.90 vs 0.92, p = 0.74) and PPV (0.90 vs 0.83, p = 0.37) compared to billing codes for syncope. For FH-SCD, NLP outperformed survey responses for all parameters (sensitivity: 0.91 vs 0.59, p = 0.002; specificity: 0.98 vs 0.50, p < 0.001; PPV: 0.97 vs 0.38, p < 0.001). NLP also achieved superior sensitivity (0.95 vs 0.24, p < 0.001) with comparable specificity (0.95 vs 1.0, p-value not calculable) and positive predictive value (PPV) (0.92 vs 1.0, p = 0.09) compared to survey responses for FH-HCM.
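
The sensitivity, specificity, and PPV figures reported above all derive from a 2x2 confusion matrix against the manually abstracted reference standard. The sketch below shows the arithmetic; the counts are hypothetical values chosen only so the ratios are easy to follow and are not taken from the study.

```python
def _ratio(numerator, denominator):
    """Safe division that returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, and positive predictive value from confusion-matrix counts."""
    return {
        "sensitivity": _ratio(tp, tp + fn),  # share of true cases that were detected
        "specificity": _ratio(tn, tn + fp),  # share of non-cases correctly ruled out
        "ppv": _ratio(tp, tp + fp),          # share of positive calls that were correct
    }


# Hypothetical counts for 100 test patients (not the study's data).
metrics = diagnostic_metrics(tp=48, fp=5, tn=45, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because PPV depends on how common the condition is in the test set, it is not directly comparable across cohorts with different prevalence, whereas sensitivity and specificity are.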

CONCLUSIONS

Automated extraction of syncope, FH-SCD and FH-HCM using NLP is feasible and has promise to increase efficiency of workflow for providers managing HCM patients.

Chen D, Liu S, Kingsbury P, Sohn S, Storlie CB, Habermann EB, Naessens JM, Larson DW, Liu H. Deep learning and alternative learning strategies for retrospective real-world clinical data. NPJ Digit Med 2019;2:43. [PMID: 31304389 PMCID: PMC6550223 DOI: 10.1038/s41746-019-0122-0] [Citation(s) in RCA: 108] [Impact Index Per Article: 21.6] [Received: 10/23/2018] [Accepted: 05/09/2019] [Indexed: 02/06/2023]

Moskovitch R, Shahar Y, Wang F, Hripcsak G. Temporal biomedical data analytics. J Biomed Inform 2019;90:103092. [PMID: 30654029 PMCID: PMC9745669 DOI: 10.1016/j.jbi.2018.12.006] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 12/22/2018] [Accepted: 12/24/2018] [Indexed: 02/07/2023]