1
Herrero-Zazo M, Fitzgerald T, Taylor V, Street H, Chaudhry AN, Bradley JR, Birney E, Keevil VL. Using machine learning to model older adult inpatient trajectories from electronic health records data. iScience 2022; 26:105876. [PMID: 36691609 PMCID: PMC9860485 DOI: 10.1016/j.isci.2022.105876] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2022] [Revised: 10/25/2022] [Accepted: 12/20/2022] [Indexed: 12/26/2022] Open
Abstract
Electronic Health Records (EHR) data can provide novel insights into inpatient trajectories. Blood tests and vital signs from de-identified patients' hospital admission episodes (AE) were represented as multivariate time-series (MVTS) to train unsupervised Hidden Markov Models (HMM) and represent each AE day as one of 17 states. All HMM states were clinically interpreted based on their patterns of MVTS variables and relationships with clinical information. Visualization differentiated patients progressing toward stable 'discharge-like' states versus those remaining at risk of inpatient mortality (IM). Chi-square tests confirmed these relationships (two states associated with IM; 12 states with ≥1 diagnosis). Logistic Regression and Random Forest (RF) models trained with MVTS data rather than states had higher prediction performances of IM, but results were comparable (best RF model AUC-ROC: MVTS data = 0.85; HMM states = 0.79). ML models extracted clinically interpretable signals from hospital data. The potential of ML to develop decision-support tools for EHR systems warrants investigation.
Affiliation(s)
- Maria Herrero-Zazo
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - Department of Medicine for the Elderly, Addenbrooke’s Hospital, Cambridge University Hospitals NHS Foundation Trust, Hills Road, Cambridge CB2 0QQ, UK
- Tomas Fitzgerald
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Vince Taylor
  - Cambridge Clinical Informatics, Addenbrooke’s Hospital, Cambridge University Hospitals NHS Foundation Trust, Hills Road, Cambridge CB2 0QQ, UK
- Helen Street
  - Research and Development, Cambridge University Hospitals NHS Foundation Trust, Hills Road, Cambridge CB2 0QQ, UK
- Afzal N. Chaudhry
  - Department of Medicine, University of Cambridge, Addenbrooke’s Hospital, Hills Road, Cambridge CB2 0QQ, UK
  - NIHR Cambridge Biomedical Research Centre, Cambridge Biomedical Campus, Cambridge CB2 0QQ, UK
- John R. Bradley
  - Department of Medicine, University of Cambridge, Addenbrooke’s Hospital, Hills Road, Cambridge CB2 0QQ, UK
  - NIHR Cambridge Biomedical Research Centre, Cambridge Biomedical Campus, Cambridge CB2 0QQ, UK
- Ewan Birney
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - Corresponding author
- Victoria L. Keevil
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - Department of Medicine for the Elderly, Addenbrooke’s Hospital, Cambridge University Hospitals NHS Foundation Trust, Hills Road, Cambridge CB2 0QQ, UK
  - Department of Medicine, University of Cambridge, Addenbrooke’s Hospital, Hills Road, Cambridge CB2 0QQ, UK
  - Corresponding author
2
Haug N, Deischinger C, Gyimesi M, Kautzky-Willer A, Thurner S, Klimek P. High-risk multimorbidity patterns on the road to cardiovascular mortality. BMC Med 2020; 18:44. [PMID: 32151252 PMCID: PMC7063814 DOI: 10.1186/s12916-020-1508-1] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 10/28/2019] [Accepted: 02/03/2020] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND Multimorbidity, the co-occurrence of two or more diseases in one patient, is a frequent phenomenon. Understanding how different diseases condition each other over the lifetime of a patient could significantly contribute to personalised prevention efforts. However, most of our current knowledge on the long-term development of the health of patients (their disease trajectories) is either confined to narrow time spans or specific (sets of) diseases. Here, we aim to identify decisive events that potentially determine the future disease progression of patients.
METHODS Health states of patients are described by algorithmically identified multimorbidity patterns (groups of included or excluded diseases) in a population-wide analysis of 9,000,000 patient histories of hospital diagnoses observed over 17 years. Over time, patients might acquire new diagnoses that change their health state; they describe a disease trajectory. We measure the age- and sex-specific risks for patients that they will acquire certain sets of diseases in the future depending on their current health state.
RESULTS In the present analysis, the population is described by a set of 132 different multimorbidity patterns. For elderly patients, we find 3 groups of multimorbidity patterns associated with low (yearly in-hospital mortality of 0.2-0.3%), medium (0.3-1%) and high in-hospital mortality (2-11%). We identify combinations of diseases that significantly increase the risk to reach the high-mortality health states in later life. For instance, in men (women) aged 50-59 diagnosed with diabetes and hypertension, the risk for moving into the high-mortality region within 1 year is increased by the factor of 1.96 ± 0.11 (2.60 ± 0.18) compared with all patients of the same age and sex, respectively, and by the factor of 2.09 ± 0.12 (3.04 ± 0.18) if additionally diagnosed with metabolic disorders.
CONCLUSIONS Our approach can be used both to forecast future disease burdens, as well as to identify the critical events in the careers of patients which strongly determine their disease progression, therefore constituting targets for efficient prevention measures. We show that the risk for cardiovascular diseases increases significantly more in females than in males when diagnosed with diabetes, hypertension and metabolic disorders.
Affiliation(s)
- Nina Haug
  - Section for Science of Complex Systems, CeMSIIS, Medical University of Vienna, Spitalgasse 23, Vienna, A-1090, Austria
  - Complexity Science Hub Vienna, Josefstädter Straße 39, Vienna, A-1080, Austria
- Carola Deischinger
  - Gender Medicine Unit, Division of Endocrinology and Metabolism, Department of Internal Medicine III, Medical University of Vienna, Spitalgasse 23, Vienna, A-1090, Austria
- Michael Gyimesi
  - Gesundheit Österreich GmbH, Stubenring 6, Vienna, A-1010, Austria
- Alexandra Kautzky-Willer
  - Gender Medicine Unit, Division of Endocrinology and Metabolism, Department of Internal Medicine III, Medical University of Vienna, Spitalgasse 23, Vienna, A-1090, Austria
- Stefan Thurner
  - Section for Science of Complex Systems, CeMSIIS, Medical University of Vienna, Spitalgasse 23, Vienna, A-1090, Austria
  - Complexity Science Hub Vienna, Josefstädter Straße 39, Vienna, A-1080, Austria
  - IIASA, Schloßplatz 1, Laxenburg, A-2361, Austria
  - Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, 85701, NM, USA
- Peter Klimek
  - Section for Science of Complex Systems, CeMSIIS, Medical University of Vienna, Spitalgasse 23, Vienna, A-1090, Austria
  - Complexity Science Hub Vienna, Josefstädter Straße 39, Vienna, A-1080, Austria
3
Arandjelovic O. Intuitive and interpretable visual communication of a complex statistical model of disease progression and risk. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2018; 2017:4199-4202. [PMID: 29060823 DOI: 10.1109/embc.2017.8037782] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022]
Abstract
Computer science and machine learning in particular are increasingly lauded for their potential to aid medical practice. However, the highly technical nature of the state of the art techniques can be a major obstacle in their usability by health care professionals and thus, their adoption and actual practical benefit. In this paper we describe a software tool which focuses on the visualization of predictions made by a recently developed method which leverages data in the form of large scale electronic records for making diagnostic predictions. Guided by risk predictions, our tool allows the user to explore interactively different diagnostic trajectories, or display cumulative long term prognostics, in an intuitive and easily interpretable manner.
4
Pham T, Tran T, Phung D, Venkatesh S. Predicting healthcare trajectories from medical records: A deep learning approach. J Biomed Inform 2017; 69:218-229. [PMID: 28410981 DOI: 10.1016/j.jbi.2017.04.001] [Citation(s) in RCA: 141] [Impact Index Per Article: 20.1] [Received: 11/10/2016] [Revised: 03/23/2017] [Accepted: 04/01/2017] [Indexed: 11/18/2022]
Abstract
Personalized predictive medicine necessitates the modeling of patient illness and care processes, which inherently have long-term temporal dependencies. Healthcare observations, stored in electronic medical records are episodic and irregular in time. We introduce DeepCare, an end-to-end deep dynamic neural network that reads medical records, stores previous illness history, infers current illness states and predicts future medical outcomes. At the data level, DeepCare represents care episodes as vectors and models patient health state trajectories by the memory of historical records. Built on Long Short-Term Memory (LSTM), DeepCare introduces methods to handle irregularly timed events by moderating the forgetting and consolidation of memory. DeepCare also explicitly models medical interventions that change the course of illness and shape future medical risk. Moving up to the health state level, historical and present health states are then aggregated through multiscale temporal pooling, before passing through a neural network that estimates future outcomes. We demonstrate the efficacy of DeepCare for disease progression modeling, intervention recommendation, and future risk prediction. On two important cohorts with heavy social and economic burden - diabetes and mental health - the results show improved prediction accuracy.
Affiliation(s)
- Trang Pham
  - Center for Pattern Recognition and Data Analytics, Deakin University Geelong, Australia
- Truyen Tran
  - Center for Pattern Recognition and Data Analytics, Deakin University Geelong, Australia
- Dinh Phung
  - Center for Pattern Recognition and Data Analytics, Deakin University Geelong, Australia
- Svetha Venkatesh
  - Center for Pattern Recognition and Data Analytics, Deakin University Geelong, Australia
5
Andrei V, Arandjelovic O. Identification of promising research directions using machine learning aided medical literature analysis. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2017; 2016:2471-2474. [PMID: 28268825 DOI: 10.1109/embc.2016.7591231] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 11/08/2022]
Abstract
The rapidly expanding corpus of medical research literature presents major challenges in the understanding of previous work, the extraction of maximum information from collected data, and the identification of promising research directions. We present a case for the use of advanced machine learning techniques as an aide in this task and introduce a novel methodology that is shown to be capable of extracting meaningful information from large longitudinal corpora, and of tracking complex temporal changes within it.
6
Nguyen P, Tran T, Wickramasinghe N, Venkatesh S. Deepr: A Convolutional Net for Medical Records. IEEE J Biomed Health Inform 2016; 21:22-30. [PMID: 27913366 DOI: 10.1109/jbhi.2016.2633963] [Citation(s) in RCA: 118] [Impact Index Per Article: 14.8] [Indexed: 11/07/2022]
Abstract
Feature engineering remains a major bottleneck when creating predictive systems from electronic medical records. At present, an important missing element is detecting predictive regular clinical motifs from irregular episodic records. We present Deepr (short for Deep record), a new end-to-end deep learning system that learns to extract features from medical records and predicts future risk automatically. Deepr transforms a record into a sequence of discrete elements separated by coded time gaps and hospital transfers. On top of the sequence is a convolutional neural net that detects and combines predictive local clinical motifs to stratify the risk. Deepr permits transparent inspection and visualization of its inner working. We validate Deepr on hospital data to predict unplanned readmission after discharge. Deepr achieves superior accuracy compared to traditional techniques, detects meaningful clinical motifs, and uncovers the underlying structure of the disease and intervention space.
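The record-to-sequence step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the gap bins, token names, diagnosis codes, and the helper names `gap_token` and `record_to_sequence` are all assumptions made for illustration, and hospital transfers are left out for brevity.

```python
from datetime import date

# Hypothetical gap bins as (upper bound in days, token); the abstract does not specify them.
GAP_BINS = [(30, "GAP_0-1M"), (90, "GAP_1-3M"), (180, "GAP_3-6M"), (365, "GAP_6-12M")]
LONGEST_GAP = "GAP_12M+"


def gap_token(days):
    """Map a gap in days to a discrete token (bin edges are illustrative)."""
    for upper, token in GAP_BINS:
        if days <= upper:
            return token
    return LONGEST_GAP


def record_to_sequence(visits):
    """Flatten (visit_date, codes) pairs into one token sequence.

    A coded time-gap token is placed between consecutive visits, so an
    irregular, episodic record becomes a regular sequence of discrete
    elements that a sequence model could consume.
    """
    ordered = sorted(visits, key=lambda visit: visit[0])
    tokens = []
    previous = None
    for visit_date, codes in ordered:
        if previous is not None:
            tokens.append(gap_token((visit_date - previous).days))
        tokens.extend(codes)
        previous = visit_date
    return tokens


# Made-up record: three visits with ICD-10-style codes.
record = [
    (date(2015, 1, 10), ["E11", "I10"]),
    (date(2015, 3, 1), ["I10"]),
    (date(2016, 6, 15), ["I50"]),
]
sequence = record_to_sequence(record)
print(sequence)
```

In a full pipeline each token would then be mapped to an embedding vector before the convolutional layers; that part is omitted here.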
7
Andrei V, Arandjelović O. Complex temporal topic evolution modelling using the Kullback-Leibler divergence and the Bhattacharyya distance. EURASIP JOURNAL ON BIOINFORMATICS & SYSTEMS BIOLOGY 2016; 2016:16. [PMID: 27746813 PMCID: PMC5042987 DOI: 10.1186/s13637-016-0050-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 05/06/2016] [Accepted: 09/12/2016] [Indexed: 11/10/2022]
Abstract
The rapidly expanding corpus of medical research literature presents major challenges in the understanding of previous work, the extraction of maximum information from collected data, and the identification of promising research directions. We present a case for the use of advanced machine learning techniques as an aide in this task and introduce a novel methodology that is shown to be capable of extracting meaningful information from large longitudinal corpora and of tracking complex temporal changes within it. Our framework is based on (i) the discretization of time into epochs, (ii) epoch-wise topic discovery using a hierarchical Dirichlet process-based model, and (iii) a temporal similarity graph which allows for the modelling of complex topic changes. More specifically, this is the first work that discusses and distinguishes between two groups of particularly challenging topic evolution phenomena: topic splitting and speciation and topic convergence and merging, in addition to the more widely recognized emergence and disappearance and gradual evolution. The proposed framework is evaluated on a public medical literature corpus.
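The two measures named in the title can be sketched for discrete distributions, for example the word distributions of topics from adjacent epochs. This is a minimal sketch under stated assumptions, not the paper's code: the topic vectors are made up, and the sketch assumes both vectors are normalized over the same vocabulary with no zero entry in `q` where `p` is positive.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in nats; asymmetric in p and q.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance: -ln of the Bhattacharyya coefficient; symmetric.

    Assumes the two distributions overlap (coefficient > 0).
    """
    coefficient = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return -math.log(coefficient)


# Hypothetical word distributions for one topic in two consecutive epochs.
topic_a = [0.5, 0.3, 0.2]
topic_b = [0.4, 0.4, 0.2]

kl_ab = kl_divergence(topic_a, topic_b)
kl_ba = kl_divergence(topic_b, topic_a)
bd_ab = bhattacharyya_distance(topic_a, topic_b)
print(kl_ab, kl_ba, bd_ab)
```

Small values of either measure between topics in neighbouring epochs could serve as edge weights in a temporal similarity graph; the thresholds used to draw such edges are not given in the abstract.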
Affiliation(s)
- Victor Andrei
  - School of Computer Science, University of St Andrews, St Andrews KY16 9SX, Fife, Scotland, UK
- Ognjen Arandjelović
  - School of Computer Science, University of St Andrews, St Andrews KY16 9SX, Fife, Scotland, UK
8
Vasiljeva I, Arandjelovic O. Towards sophisticated learning from EHRs: increasing prediction specificity and accuracy using clinically meaningful risk criteria. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2016; 2016:2452-2455. [PMID: 28268820 DOI: 10.1109/embc.2016.7591226] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 06/06/2023]
Abstract
Computer based analysis of Electronic Health Records (EHRs) has the potential to provide major novel insights of benefit both to specific individuals in the context of personalized medicine, as well as on the level of population-wide health care and policy. The present paper introduces a novel algorithm that uses machine learning for the discovery of longitudinal patterns in the diagnoses of diseases. Two key technical novelties are introduced: one in the form of a novel learning paradigm which enables greater learning specificity, and another in the form of a risk driven identification of confounding diagnoses. We present a series of experiments which demonstrate the effectiveness of the proposed techniques, and which reveal novel insights regarding the most promising future research directions.
9
Lopez-Campos GH, Martin-Sanchez F, Gray K. Comment on 'Discovering hospital admission patterns using models learnt from electronic hospital records'. The importance of using the right codes. Bioinformatics 2016; 32:2079-80. [PMID: 27153706 DOI: 10.1093/bioinformatics/btw078] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/02/2015] [Accepted: 12/22/2015] [Indexed: 11/12/2022] Open
Abstract
CONTACT guillermo.lopez@unimelb.edu.au.
Affiliation(s)
- Guillermo H Lopez-Campos
  - Health and Biomedical Informatics Centre, The University of Melbourne, Parkville, VIC, Australia
- Fernando Martin-Sanchez
  - Department of Healthcare Policy and Research, Division of Health Informatics, Weill Cornell Medicine, New York, NY, USA
- Kathleen Gray
  - Health and Biomedical Informatics Centre, The University of Melbourne, Parkville, VIC, Australia
10
Arandjelović O. On the discovery of hospital admission patterns-a clarification. Bioinformatics 2016; 32:2078. [PMID: 26819471 DOI: 10.1093/bioinformatics/btw049] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 01/19/2016] [Accepted: 01/20/2016] [Indexed: 11/12/2022] Open
Abstract
CONTACT ognjen.arandjelvoic@gmail.com.
11
DeepCare: A Deep Dynamic Memory Model for Predictive Medicine. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING 2016. [DOI: 10.1007/978-3-319-31750-2_3] [Citation(s) in RCA: 102] [Impact Index Per Article: 12.8] [Indexed: 12/03/2022]