51
Zhang B, Shi H, Wang H. Machine Learning and AI in Cancer Prognosis, Prediction, and Treatment Selection: A Critical Approach. J Multidiscip Healthc 2023; 16:1779-1791. [PMID: 37398894 PMCID: PMC10312208 DOI: 10.2147/jmdh.s410301] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 03/15/2023] [Accepted: 06/12/2023] [Indexed: 07/04/2023] Open
Abstract
Cancer is a leading cause of morbidity and mortality worldwide. While progress has been made in the diagnosis, prognosis, and treatment of cancer patients, individualized and data-driven care remains a challenge. Artificial intelligence (AI), which is used to automate the prediction of many cancers, has emerged as a promising option for improving healthcare accuracy and patient outcomes. AI applications in oncology include risk assessment, early diagnosis, patient prognosis estimation, and treatment selection based on deep knowledge. Machine learning (ML), a subset of AI that enables computers to learn from training data, has been highly effective at predicting various types of cancer, including breast, brain, lung, liver, and prostate cancer. In fact, AI and ML have demonstrated greater accuracy in predicting cancer than clinicians. These technologies also have the potential to improve the diagnosis, prognosis, and quality of life of patients with various illnesses, not just cancer. Therefore, it is important to improve current AI and ML technologies and to develop new programs to benefit patients. This article examines the use of AI and ML algorithms in cancer prediction, including their current applications, limitations, and future prospects.
Affiliation(s)
- Bo Zhang
  - Jinling Institute of Science and Technology, Nanjing City, Jiangsu Province, People’s Republic of China
- Huiping Shi
  - Jinling Institute of Science and Technology, Nanjing City, Jiangsu Province, People’s Republic of China
- Hongtao Wang
  - School of Life Science, Tonghua Normal University, Tonghua City, Jilin Province, People’s Republic of China
52
Wang M, Sushil M, Miao BY, Butte AJ. Bottom-up and top-down paradigms of artificial intelligence research approaches to healthcare data science using growing real-world big data. J Am Med Inform Assoc 2023; 30:1323-1332. [PMID: 37187158 PMCID: PMC10280344 DOI: 10.1093/jamia/ocad085] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/22/2023] [Revised: 04/03/2023] [Accepted: 05/04/2023] [Indexed: 05/17/2023] Open
Abstract
OBJECTIVES As real-world electronic health record (EHR) data continue to grow exponentially, novel methodologies involving artificial intelligence (AI) are increasingly applied to enable efficient data-driven learning and, ultimately, to advance healthcare. Our objective is to provide readers with an understanding of evolving computational methods and to help them decide which methods to pursue. TARGET AUDIENCE The sheer diversity of existing methods presents a challenge for health scientists who are beginning to apply computational methods to their research. Therefore, this tutorial is aimed at scientists working with EHR data who are early entrants into the field of applying AI methodologies. SCOPE This manuscript describes the diverse and growing AI research approaches in healthcare data science and categorizes them into 2 distinct paradigms, the bottom-up and top-down paradigms, to provide health scientists venturing into artificial intelligence research with an understanding of the evolving computational methods and to help them decide which methods to pursue through the lens of real-world healthcare data.
Affiliation(s)
- Michelle Wang
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, California, USA
- Madhumita Sushil
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, California, USA
- Brenda Y Miao
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, California, USA
- Atul J Butte
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, California, USA
  - Department of Pediatrics, University of California, San Francisco, San Francisco, California, USA
53
Eskofier BM, Klucken J. Predictive Models for Health Deterioration: Understanding Disease Pathways for Personalized Medicine. Annu Rev Biomed Eng 2023; 25:131-156. [PMID: 36854259 DOI: 10.1146/annurev-bioeng-110220-030247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2023]
Abstract
Artificial intelligence (AI) and machine learning (ML) methods are currently widely employed in medicine and healthcare. A PubMed search returns more than 100,000 articles on these topics published between 2018 and 2022 alone. Notwithstanding several recent reviews in various subfields of AI and ML in medicine, we have yet to see a comprehensive review around the methods' use in longitudinal analysis and prediction of an individual patient's health status within a personalized disease pathway. This review seeks to fill that gap. After an overview of the AI and ML methods employed in this field and of specific medical applications of models of this type, the review discusses the strengths and limitations of current studies and looks ahead to future strands of research in this field. We aim to enable interested readers to gain a detailed impression of the research currently available and accordingly plan future work around predictive models for deterioration in health status.
Affiliation(s)
- Bjoern M Eskofier
  - Machine Learning and Data Analytics Lab, Department of Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Jochen Klucken
  - Digital Medicine Group, Luxembourg Centre for Systems Biomedicine, Université du Luxembourg, Belvaux, Luxembourg
  - Digital Medicine Group, Department of Precision Health, Luxembourg Institute of Health, Strassen, Luxembourg
  - Centre Hospitalier de Luxembourg, Luxembourg City, Luxembourg
54
Hu YH, Hung JH, Hu LY, Huang SY, Shen CC. An analysis of Chinese nursing electronic medical records to predict violence in psychiatric inpatients using text mining and machine learning techniques. PLoS One 2023; 18:e0286347. [PMID: 37285344 DOI: 10.1371/journal.pone.0286347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2023] [Accepted: 05/14/2023] [Indexed: 06/09/2023] Open
Abstract
BACKGROUND The prevalence of violence in acute psychiatric wards is a critical concern. According to a meta-analysis investigating violence in psychiatric inpatient units, researchers estimated that approximately 17% of inpatients commit one or more acts of violence during their stay. Inpatient violence negatively affects health-care providers and patients and may contribute to high staff turnover. Therefore, predicting which psychiatric inpatients will commit violence is of considerable clinical significance. OBJECTIVE The present study aimed to estimate the violence rate for psychiatric inpatients and establish a predictive model for violence in psychiatric inpatients. METHODS We collected the structured and unstructured data from Chinese nursing electronic medical records (EMRs) for the violence prediction. The data was obtained from the psychiatry department of a regional hospital in southern Taiwan, covering the period between January 2008 and December 2018. Several text mining and machine learning techniques were employed to analyze the data. RESULTS The results demonstrated that the rate of violence in psychiatric inpatients is 19.7%. The patients with violence in psychiatric wards were generally younger, had a more violent history, and were more likely to be unmarried. Furthermore, our study supported the feasibility of predicting aggressive incidents in psychiatric wards by using nursing EMRs and the proposed method can be incorporated into routine clinical practice to enable early prediction of inpatient violence. CONCLUSIONS Our findings may provide clinicians with a new basis for judgment of the risk of violence in psychiatric wards.
Affiliation(s)
- Ya-Han Hu
  - Department of Information Management, National Central University, Taoyuan City, Taiwan
  - Asian Institute for Impact Measurement and Management, National Central University, Taoyuan City, Taiwan
- Jeng-Hsiu Hung
  - Department of Obstetrics and Gynecology, Taipei Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation, Taipei, Taiwan
  - School of Medicine, Tzu Chi University, Hualien, Taiwan
- Li-Yu Hu
  - School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Department of Psychiatry, Taipei Veterans General Hospital, Taipei, Taiwan
- Sheng-Yun Huang
  - Department of Psychiatry, Chiayi Branch, Taichung Veterans General Hospital, Chiayi, Taiwan
- Cheng-Che Shen
  - School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Department of Psychiatry, Chiayi Branch, Taichung Veterans General Hospital, Chiayi, Taiwan
  - Center for Innovative Research on Aging Society (CIRAS), National Chung Cheng University, Minxiong, Taiwan
  - Department of Post-Baccalaureate Medicine, College of Medicine, National Chung Hsing University, Taichung, Taiwan
55
La Cava WG, Lee PC, Ajmal I, Ding X, Solanki P, Cohen JB, Moore JH, Herman DS. A flexible symbolic regression method for constructing interpretable clinical prediction models. NPJ Digit Med 2023; 6:107. [PMID: 37277550 DOI: 10.1038/s41746-023-00833-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2021] [Accepted: 05/05/2023] [Indexed: 06/07/2023] Open
Abstract
Machine learning (ML) models trained for triggering clinical decision support (CDS) are typically either accurate or interpretable but not both. Scaling CDS to the panoply of clinical use cases while mitigating risks to patients will require many ML models be intuitively interpretable for clinicians. To this end, we adapted a symbolic regression method, coined the feature engineering automation tool (FEAT), to train concise and accurate models from high-dimensional electronic health record (EHR) data. We first present an in-depth application of FEAT to classify hypertension, hypertension with unexplained hypokalemia, and apparent treatment-resistant hypertension (aTRH) using EHR data for 1200 subjects receiving longitudinal care in a large healthcare system. FEAT models trained to predict phenotypes adjudicated by chart review had equivalent or higher discriminative performance (p < 0.001) and were at least three times smaller (p < 1 × 10^-6) than other potentially interpretable models. For aTRH, FEAT generated a six-feature, highly discriminative (positive predictive value = 0.70, sensitivity = 0.62), and clinically intuitive model. To assess the generalizability of the approach, we tested FEAT on 25 benchmark clinical phenotyping tasks using the MIMIC-III critical care database. Under comparable dimensionality constraints, FEAT's models exhibited higher area under the receiver-operating curve scores than penalized linear models across tasks (p < 6 × 10^-6). In summary, FEAT can train EHR prediction models that are both intuitively interpretable and accurate, which should facilitate safe and effective scaling of ML-triggered CDS to the panoply of potential clinical use cases and healthcare practices.
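The positive predictive value and sensitivity quoted for the aTRH model are ratios of confusion-matrix counts. The sketch below shows only how the two metrics are computed. The counts are hypothetical, chosen so that the ratios round to the reported values, and are not taken from the study.

```python
def ppv(true_pos, false_pos):
    """Positive predictive value: share of flagged patients who truly have the phenotype."""
    return true_pos / (true_pos + false_pos)


def sensitivity(true_pos, false_neg):
    """Sensitivity (recall): share of true phenotype cases that the model flags."""
    return true_pos / (true_pos + false_neg)


# Hypothetical counts for illustration only (not from the paper).
tp, fp, fn = 62, 27, 38

model_ppv = ppv(tp, fp)            # 62 / 89
model_sens = sensitivity(tp, fn)   # 62 / 100

print(f"PPV = {model_ppv:.2f}, sensitivity = {model_sens:.2f}")
# prints "PPV = 0.70, sensitivity = 0.62"
```

A PPV of 0.70 with a sensitivity of 0.62 means that, in such a setting, most flagged patients are true cases while roughly four in ten true cases go unflagged.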
Affiliation(s)
- William G La Cava
  - Computational Health Informatics Program, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Paul C Lee
  - Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Imran Ajmal
  - Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Xiruo Ding
  - Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Priyanka Solanki
  - Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Jordana B Cohen
  - Division of Renal-Electrolyte and Hypertension, Department of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Jason H Moore
  - Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Daniel S Herman
  - Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA, USA
56
Estiri H, Azhir A, Blacker DL, Ritchie CS, Patel CJ, Murphy SN. Temporal characterization of Alzheimer's Disease with sequences of clinical records. EBioMedicine 2023; 92:104629. [PMID: 37247495 DOI: 10.1016/j.ebiom.2023.104629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Revised: 05/05/2023] [Accepted: 05/10/2023] [Indexed: 05/31/2023] Open
Abstract
BACKGROUND Alzheimer's Disease (AD) is a complex clinical phenotype with unprecedented social and economic tolls on an ageing global population. Real-world data (RWD) from electronic health records (EHRs) offer opportunities to accelerate precision drug development and scale epidemiological research on AD. A precise characterization of AD cohorts is needed to address the noise abundant in RWD. METHODS We conducted a retrospective cohort study to develop and test computational models for AD cohort identification using clinical data from 8 Massachusetts healthcare systems. We mined temporal representations from EHR data using the transitive sequential pattern mining algorithm (tSPM) to train and validate our models. We then tested our models against a held-out test set from a review of medical records to adjudicate the presence of AD. We trained two classes of Machine Learning models, using Gradient Boosting Machine (GBM), to compare the utility of AD diagnosis records versus the tSPM temporal representations (comprising sequences of diagnosis and medication observations) from electronic medical records for characterizing AD cohorts. FINDINGS In a group of 4985 patients, we identified 219 tSPM temporal representations (i.e., transitive sequences) of medical records for constructing the best classification models. The models with sequential features improved AD classification by a magnitude of 3-16 percent over the use of AD diagnosis codes alone. The computed cohort included 663 patients, 35 of whom had no record of AD. Six groups of tSPM sequences were identified for characterizing the AD cohorts. INTERPRETATION We present sequential patterns of diagnosis and medication codes from electronic medical records, as digital markers of Alzheimer's Disease. Classification algorithms developed on sequential patterns can replace standard features from EHRs to enrich phenotype modelling. 
FUNDING National Institutes of Health: the National Institute on Aging (RF1AG074372) and the National Institute of Allergy and Infectious Diseases (R01AI165535).
Affiliation(s)
- Hossein Estiri
  - Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
- Alaleh Azhir
  - Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Harvard Medical School, Harvard-MIT Program in Health Sciences and Technology, USA
- Deborah L Blacker
  - Department of Psychiatry, Massachusetts General Hospital, Boston, MA, USA
- Chirag J Patel
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Shawn N Murphy
  - Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
57
Murali L, Gopakumar G, Viswanathan DM, Nedungadi P. Towards electronic health record-based medical knowledge graph construction, completion, and applications: A literature study. J Biomed Inform 2023:104403. [PMID: 37230406 DOI: 10.1016/j.jbi.2023.104403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Revised: 05/16/2023] [Accepted: 05/19/2023] [Indexed: 05/27/2023]
Abstract
With the growth of data and intelligent technologies, the healthcare sector has opened up numerous technology-enabled services for patients, clinicians, and researchers. One major hurdle in achieving state-of-the-art results in health informatics is domain-specific terminologies and their semantic complexities. A knowledge graph crafted from medical concepts, events, and relationships acts as a medical semantic network to extract new links and hidden patterns from health data sources. Current medical knowledge graph construction studies are limited to generic techniques and opportunities and focus less on exploiting real-world data sources in knowledge graph construction. A knowledge graph constructed from Electronic Health Records (EHR) data obtains real-world data from healthcare records. It ensures better results in subsequent tasks like knowledge extraction and inference, knowledge graph completion, and medical knowledge graph applications such as diagnosis predictions, clinical recommendations, and clinical decision support. This review critically analyses existing works on medical knowledge graphs that used EHR data as the data source at the (i) representation level, (ii) extraction level, and (iii) completion level. In this investigation, we found that EHR-based knowledge graph construction involves challenges such as high complexity and dimensionality of data, lack of knowledge fusion, and dynamic update of the knowledge graph. In addition, the study presents possible ways to tackle the challenges identified. Our findings conclude that future research should focus on knowledge graph integration and knowledge graph completion challenges.
Affiliation(s)
- Lino Murali
  - Center for Research in Analytics and Technologies for Education (CREATE), Amrita Vishwa Vidyapeetham, Amritapuri, Kollam, 690525, Kerala, India
  - Division of Information technology, School of Engineering, Cochin University of Science and Technology, Kochi, 682022, Kerala, India
- G Gopakumar
  - Department of Computer Science and Engineering, School of Computing, Amrita Vishwa Vidyapeetham, Amritapuri, Kollam, 690525, Kerala, India
- Daleesha M Viswanathan
  - Division of Information technology, School of Engineering, Cochin University of Science and Technology, Kochi, 682022, Kerala, India
- Prema Nedungadi
  - Center for Research in Analytics and Technologies for Education (CREATE), Amrita Vishwa Vidyapeetham, Amritapuri, Kollam, 690525, Kerala, India
  - Department of Computer Science and Engineering, School of Computing, Amrita Vishwa Vidyapeetham, Amritapuri, Kollam, 690525, Kerala, India
58
Gan Z, Zhou D, Rush E, Panickan VA, Ho YL, Ostrouchov G, Xu Z, Shen S, Xiong X, Greco KF, Hong C, Bonzel CL, Wen J, Costa L, Cai T, Begoli E, Xia Z, Gaziano JM, Liao KP, Cho K, Cai T, Lu J. ARCH: Large-scale Knowledge Graph via Aggregated Narrative Codified Health Records Analysis. medRxiv 2023:2023.05.14.23289955. [PMID: 37293026 PMCID: PMC10246054 DOI: 10.1101/2023.05.14.23289955] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Objective Electronic health record (EHR) systems contain a wealth of clinical data stored as both codified data and free-text narrative notes, covering hundreds of thousands of clinical concepts available for research and clinical care. The complex, massive, heterogeneous, and noisy nature of EHR data imposes significant challenges for feature representation, information extraction, and uncertainty quantification. To address these challenges, we proposed an efficient Aggregated naRrative Codified Health (ARCH) records analysis to generate a large-scale knowledge graph (KG) for a comprehensive set of EHR codified and narrative features. Methods The ARCH algorithm first derives embedding vectors from a co-occurrence matrix of all EHR concepts and then generates cosine similarities along with associated p-values to measure the strength of relatedness between clinical features with statistical certainty quantification. In the final step, ARCH performs a sparse embedding regression to remove indirect linkage between entity pairs. We validated the clinical utility of the ARCH knowledge graph, generated from 12.5 million patients in the Veterans Affairs (VA) healthcare system, through downstream tasks including detecting known relationships between entity pairs, predicting drug side effects, disease phenotyping, as well as sub-typing Alzheimer's disease patients. Results ARCH produces high-quality clinical embeddings and KG for over 60,000 EHR concepts, as visualized in the R-shiny powered web-API (https://celehs.hms.harvard.edu/ARCH/). The ARCH embeddings attained an average area under the ROC curve (AUC) of 0.926 and 0.861 for detecting pairs of similar EHR concepts when the concepts are mapped to codified data and to NLP data, respectively; and 0.810 (codified) and 0.843 (NLP) for detecting related pairs. Based on the p-values computed by ARCH, the sensitivities for detecting similar and related entity pairs are 0.906 and 0.888, respectively, under false discovery rate (FDR) control of 5%.
For detecting drug side effects, the cosine similarity based on the ARCH semantic representations achieved an AUC of 0.723, while the AUC improved to 0.826 after few-shot training via minimizing the loss function on the training data set. Incorporating NLP data substantially improved the ability to detect side effects in the EHR. For example, based on unsupervised ARCH embeddings, the power of detecting drug-side effect pairs when using codified data only was 0.15, much lower than the power of 0.51 when using both codified and NLP concepts. Compared to existing large-scale representation learning methods including PubmedBERT, BioBERT and SAPBERT, ARCH attains the most robust performance and substantially higher accuracy in detecting these relationships. Incorporating ARCH-selected features in weakly supervised phenotyping algorithms can improve the robustness of algorithm performance, especially for diseases that benefit from NLP features as supporting evidence. For example, the phenotyping algorithm for depression attained an AUC of 0.927 when using ARCH-selected features but only 0.857 when using codified features selected via the KESER network[1]. In addition, embeddings and knowledge graphs generated from the ARCH network were able to cluster AD patients into two subgroups, where the fast progression subgroup had a much higher mortality rate. Conclusions The proposed ARCH algorithm generates large-scale high-quality semantic representations and a knowledge graph for both codified and NLP EHR features, useful for a wide range of predictive modeling tasks.
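The first two steps of the Methods (vectors derived from a co-occurrence matrix, then cosine similarity between concepts) can be illustrated with a toy sketch. This is not the ARCH implementation: the positive pointwise mutual information (PPMI) weighting is an assumption made here for illustration, the concept names and counts are invented, and the low-dimensional embedding, p-value, and sparse regression steps are omitted.

```python
import math


def ppmi_rows(cooc):
    """Convert a symmetric co-occurrence count matrix into PPMI row vectors."""
    total = sum(sum(row) for row in cooc)
    row_sums = [sum(row) for row in cooc]
    vectors = []
    for i, row in enumerate(cooc):
        vec = []
        for j, count in enumerate(row):
            if count == 0:
                vec.append(0.0)  # no co-occurrence: PPMI is defined as 0
                continue
            pmi = math.log(count * total / (row_sums[i] * row_sums[j]))
            vec.append(max(pmi, 0.0))  # keep only positive associations
        vectors.append(vec)
    return vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical concepts and counts, invented for illustration.
concepts = ["depression_code", "depressed_mood_term", "sertraline",
            "hip_fracture_code", "hip_xray"]
cooc = [
    [0, 6, 9, 1, 0],
    [6, 0, 8, 0, 1],
    [9, 8, 0, 1, 0],
    [1, 0, 1, 0, 10],
    [0, 1, 0, 10, 0],
]

vectors = ppmi_rows(cooc)
similar = cosine(vectors[0], vectors[1])    # codified vs. narrative depression concept
unrelated = cosine(vectors[0], vectors[3])  # depression code vs. hip fracture code
print(f"similar pair: {similar:.3f}, unrelated pair: {unrelated:.3f}")
```

In this toy data the codified and narrative depression concepts score higher than the depression and hip fracture codes because they share a context (the sertraline column), which is the kind of signal a cosine score over co-occurrence-derived vectors picks up.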
Affiliation(s)
- Doudou Zhou
  - Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Everett Rush
  - Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Vidul A Panickan
  - Harvard Medical School, Boston, MA, USA
  - VA Boston Healthcare System, Boston, MA, USA
- Yuk-Lam Ho
  - VA Boston Healthcare System, Boston, MA, USA
- Zhiwei Xu
  - University of Michigan, Ann Arbor, MI, USA
- Shuting Shen
  - Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Xin Xiong
  - Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Clara-Lea Bonzel
  - Harvard Medical School, Boston, MA, USA
  - VA Boston Healthcare System, Boston, MA, USA
- Jun Wen
  - Harvard Medical School, Boston, MA, USA
- Tianrun Cai
  - VA Boston Healthcare System, Boston, MA, USA
  - Brigham and Women's Hospital, Boston, MA, USA
- Edmon Begoli
  - Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Zongqi Xia
  - University of Pittsburgh, Pittsburgh, USA
- J Michael Gaziano
  - Harvard Medical School, Boston, MA, USA
  - VA Boston Healthcare System, Boston, MA, USA
  - Brigham and Women's Hospital, Boston, MA, USA
- Katherine P Liao
  - VA Boston Healthcare System, Boston, MA, USA
  - Brigham and Women's Hospital, Boston, MA, USA
- Kelly Cho
  - Harvard Medical School, Boston, MA, USA
  - VA Boston Healthcare System, Boston, MA, USA
  - Brigham and Women's Hospital, Boston, MA, USA
- Tianxi Cai
  - Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA
  - VA Boston Healthcare System, Boston, MA, USA
- Junwei Lu
  - Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - VA Boston Healthcare System, Boston, MA, USA
59
Soman K, Nelson CA, Cerono G, Goldman SM, Baranzini SE, Brown EG. Early detection of Parkinson's disease through enriching the electronic health record using a biomedical knowledge graph. Front Med (Lausanne) 2023; 10:1081087. [PMID: 37250641 PMCID: PMC10217780 DOI: 10.3389/fmed.2023.1081087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2022] [Accepted: 04/18/2023] [Indexed: 05/31/2023] Open
Abstract
Introduction Early diagnosis of Parkinson's disease (PD) is important to identify treatments to slow neurodegeneration. People who develop PD often have symptoms that appear before the disease manifests and that may be coded as diagnoses in the electronic health record (EHR). Methods To predict PD diagnosis, we embedded EHR data of patients onto a biomedical knowledge graph called Scalable Precision medicine Open Knowledge Engine (SPOKE) and created patient embedding vectors. We trained and validated a classifier using these vectors from 3,004 PD patients, restricting records to 1, 3, and 5 years before diagnosis, and from a non-PD group of 457,197 individuals. Results The classifier predicted PD diagnosis with moderate accuracy (AUC = 0.77 ± 0.06, 0.74 ± 0.05, 0.72 ± 0.05 at 1, 3, and 5 years) and performed better than other benchmark methods. Nodes in the SPOKE graph, among cases, revealed novel associations, while SPOKE patient vectors revealed the basis for individual risk classification. Discussion The proposed method was able to explain the clinical predictions using the knowledge graph, thereby making the predictions clinically interpretable. Through enriching EHR data with biomedical associations, SPOKE may be a cost-efficient and personalized way to predict PD diagnosis years before its occurrence.
Affiliation(s)
- Karthik Soman
  - Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, United States
- Charlotte A. Nelson
  - Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, United States
- Gabriel Cerono
  - Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, United States
- Samuel M. Goldman
  - Division of Occupational and Environmental Medicine, University of California, San Francisco, San Francisco, CA, United States
- Sergio E. Baranzini
  - Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, United States
- Ethan G. Brown
  - Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, United States
60
Ledziński Ł, Grześk G. Artificial Intelligence Technologies in Cardiology. J Cardiovasc Dev Dis 2023; 10:jcdd10050202. [PMID: 37233169 DOI: 10.3390/jcdd10050202] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/28/2023] [Revised: 05/03/2023] [Accepted: 05/04/2023] [Indexed: 05/27/2023] Open
Abstract
As the world produces exabytes of data, there is a growing need to find new methods that are more suitable for dealing with complex datasets. Artificial intelligence (AI) has significant potential to impact the healthcare industry, which is already on the road to change with the digital transformation of vast quantities of information. The implementation of AI has already achieved success in the domains of molecular chemistry and drug discovery. The reduction in costs and in the time needed for experiments to predict the pharmacological activities of new molecules is a milestone in science. These successful applications of AI algorithms provide hope for a revolution in healthcare systems. A significant part of artificial intelligence is machine learning (ML), of which there are three main types: supervised learning, unsupervised learning, and reinforcement learning. In this review, the full scope of the AI workflow is presented, with explanations of the most-often-used ML algorithms and descriptions of performance metrics for both regression and classification. A brief introduction to explainable artificial intelligence (XAI) is provided, with examples of technologies that have been developed for XAI. We review important AI implementations in cardiology for supervised, unsupervised, and reinforcement learning and natural language processing, emphasizing the algorithms used. Finally, we discuss the need to establish legal, ethical, and methodical requirements for the deployment of AI models in medicine.
Affiliation(s)
- Łukasz Ledziński
  - Department of Cardiology and Clinical Pharmacology, Faculty of Health Sciences, Collegium Medicum in Bydgoszcz, Nicolaus Copernicus University in Toruń, Ujejskiego 75, 85-168 Bydgoszcz, Poland
- Grzegorz Grześk
  - Department of Cardiology and Clinical Pharmacology, Faculty of Health Sciences, Collegium Medicum in Bydgoszcz, Nicolaus Copernicus University in Toruń, Ujejskiego 75, 85-168 Bydgoszcz, Poland
61
Houssein EH, Mohamed RE, Ali AA. Heart disease risk factors detection from electronic health records using advanced NLP and deep learning techniques. Sci Rep 2023; 13:7173. [PMID: 37138014 PMCID: PMC10156668 DOI: 10.1038/s41598-023-34294-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2022] [Accepted: 04/27/2023] [Indexed: 05/05/2023] Open
Abstract
Heart disease remains the major cause of death, despite recent improvements in prediction and prevention. Risk factor identification is the main step in diagnosing and preventing heart disease. Automatically detecting risk factors for heart disease in clinical notes can help with disease progression modeling and clinical decision-making. Many studies have attempted to detect risk factors for heart disease, but none have identified all risk factors. These studies have proposed hybrid systems that combine knowledge-driven and data-driven techniques, based on dictionaries, rules, and machine learning methods that require significant human effort. The National Center for Informatics for Integrating Biology and the Bedside (i2b2) proposed a clinical natural language processing (NLP) challenge in 2014, with a track (track2) focused on detecting risk factors for heart disease in clinical notes over time. Clinical narratives provide a wealth of information that can be extracted using NLP and Deep Learning techniques. The objective of this paper is to improve on previous work from the 2014 i2b2 challenge by identifying tags and attributes relevant to disease diagnosis, risk factors, and medications, using advanced techniques based on stacked word embeddings. Results on the i2b2 heart disease risk factors challenge dataset improved significantly with the stacking approach, which combines various embeddings. Our model achieved an F1 score of 93.66% by stacking BERT and character embeddings (CHARACTER-BERT Embedding). The proposed model achieved significant results compared to all other models and systems that we developed for the 2014 i2b2 challenge.
Affiliation(s)
- Essam H Houssein
- Faculty of Computers and Information, Minia University, Minia, Egypt.
| | - Rehab E Mohamed
- Faculty of Computers and Information, Minia University, Minia, Egypt
| | - Abdelmgeid A Ali
- Faculty of Computers and Information, Minia University, Minia, Egypt
| |
|
62
|
Zhu Y, Li J, Kim J, Li S, Zhao Y, Bahari J, Eliahoo P, Li G, Kawakita S, Haghniaz R, Gao X, Falcone N, Ermis M, Kang H, Liu H, Kim H, Tabish T, Yu H, Li B, Akbari M, Emaminejad S, Khademhosseini A. Skin-interfaced electronics: A promising and intelligent paradigm for personalized healthcare. Biomaterials 2023; 296:122075. [PMID: 36931103 PMCID: PMC10085866 DOI: 10.1016/j.biomaterials.2023.122075] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 10/26/2022] [Revised: 02/23/2023] [Accepted: 03/02/2023] [Indexed: 03/09/2023]
Abstract
Skin-interfaced electronics (skintronics) have received considerable attention due to their thinness, skin-like mechanical softness, excellent conformability, and multifunctional integration. Current advancements in skintronics have enabled health monitoring and digital medicine. Particularly, skintronics offer a personalized platform for early-stage disease diagnosis and treatment. In this comprehensive review, we discuss (1) the state-of-the-art skintronic devices, (2) material selections and platform considerations of future skintronics toward intelligent healthcare, (3) device fabrication and system integrations of skintronics, (4) an overview of the skintronic platform for personalized healthcare applications, including biosensing as well as wound healing, sleep monitoring, the assessment of SARS-CoV-2, and the augmented reality-/virtual reality-enhanced human-machine interfaces, and (5) current challenges and future opportunities of skintronics and their potentials in clinical translation and commercialization. The field of skintronics will not only minimize physical and physiological mismatches with the skin but also shift the paradigm in intelligent and personalized healthcare and offer unprecedented promise to revolutionize conventional medical practices.
Affiliation(s)
- Yangzhi Zhu
- Terasaki Institute for Biomedical Innovation, Los Angeles, CA, 90064, United States.
| | - Jinghang Li
- Terasaki Institute for Biomedical Innovation, Los Angeles, CA, 90064, United States
| | - Jinjoo Kim
- Terasaki Institute for Biomedical Innovation, Los Angeles, CA, 90064, United States
| | - Shaopei Li
- Terasaki Institute for Biomedical Innovation, Los Angeles, CA, 90064, United States
| | - Yichao Zhao
- Interconnected and Integrated Bioelectronics Lab, Department of Electrical and Computer Engineering, and Materials Science and Engineering, University of California, Los Angeles, CA, 90095, United States
| | - Jamal Bahari
- Terasaki Institute for Biomedical Innovation, Los Angeles, CA, 90064, United States
| | - Payam Eliahoo
- Biomedical Engineering Department, University of Southern California, Los Angeles, CA, 90007, United States
| | - Guanghui Li
- The Centre of Nanoscale Science and Technology and Key Laboratory of Functional Polymer Materials, Institute of Polymer Chemistry, College of Chemistry, Nankai University, Tianjin, 300071, China; Renewable Energy Conversion and Storage Center (RECAST), Nankai University, Tianjin, 300071, China
| | - Satoru Kawakita
- Terasaki Institute for Biomedical Innovation, Los Angeles, CA, 90064, United States
| | - Reihaneh Haghniaz
- Terasaki Institute for Biomedical Innovation, Los Angeles, CA, 90064, United States
| | - Xiaoxiang Gao
- Department of Nanoengineering, University of California, San Diego, La Jolla, CA, 92093, United States
| | - Natashya Falcone
- Terasaki Institute for Biomedical Innovation, Los Angeles, CA, 90064, United States
| | - Menekse Ermis
- Terasaki Institute for Biomedical Innovation, Los Angeles, CA, 90064, United States
| | - Heemin Kang
- Department of Materials Science and Engineering, Korea University, Seoul, 02841, Republic of Korea
| | - Hao Liu
- Bioinspired Engineering and Biomechanics Center (BEBC), Xi'an Jiaotong University, Xi'an, 710049, PR China
| | - HanJun Kim
- Terasaki Institute for Biomedical Innovation, Los Angeles, CA, 90064, United States; College of Pharmacy, Korea University, Sejong, 30019, Republic of Korea
| | - Tanveer Tabish
- Division of Cardiovascular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, OX3 7BN, United Kingdom
| | - Haidong Yu
- Frontiers Science Center for Flexible Electronics, Xi'an Institute of Flexible Electronics (IFE) and Xi'an Institute of Biomedical Materials & Engineering, Northwestern Polytechnical University, Xi'an, 710072, PR China
| | - Bingbing Li
- Terasaki Institute for Biomedical Innovation, Los Angeles, CA, 90064, United States; Department of Manufacturing Systems Engineering and Management, California State University, Northridge, CA, 91330, United States
| | - Mohsen Akbari
- Terasaki Institute for Biomedical Innovation, Los Angeles, CA, 90064, United States; Laboratory for Innovation in Microengineering (LiME), Department of Mechanical Engineering, Center for Biomedical Research, University of Victoria, Victoria, BC V8P 2C5, Canada
| | - Sam Emaminejad
- Interconnected and Integrated Bioelectronics Lab, Department of Electrical and Computer Engineering, and Materials Science and Engineering, University of California, Los Angeles, CA, 90095, United States
| | - Ali Khademhosseini
- Terasaki Institute for Biomedical Innovation, Los Angeles, CA, 90064, United States.
| |
|
63
|
Steiger E, Kroll LE. Patient Embeddings From Diagnosis Codes for Health Care Prediction Tasks: Pat2Vec Machine Learning Framework. JMIR AI 2023; 2:e40755. [PMID: 38875541 PMCID: PMC11041498 DOI: 10.2196/40755] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/04/2022] [Revised: 12/09/2022] [Accepted: 03/18/2023] [Indexed: 06/16/2024]
Abstract
BACKGROUND In health care, diagnosis codes in claims data and electronic health records (EHRs) play an important role in data-driven decision making. Any analysis that uses a patient's diagnosis codes to predict future outcomes or describe morbidity requires a numerical representation of this diagnosis profile made up of string-based diagnosis codes. These numerical representations are especially important for machine learning models. Most commonly, binary-encoded representations have been used, usually for a subset of diagnoses. In real-world health care applications, several issues arise: patient profiles show high variability even when the underlying diseases are the same, they may have gaps and not contain all available information, and a large number of appropriate diagnoses must be considered. OBJECTIVE We herein present Pat2Vec, a self-supervised machine learning framework inspired by neural network-based natural language processing that embeds complete diagnosis profiles into a small real-valued numerical vector. METHODS Based on German outpatient claims data with diagnosis codes according to the International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10), we discovered an optimal vectorization embedding model for patient diagnosis profiles with Bayesian optimization for the hyperparameters. The calibration process ensured a robust embedding model for health care-relevant tasks by aggregating the metrics of different regression and classification tasks using different machine learning algorithms (linear and logistic regression as well as gradient-boosted trees). The models were tested against a baseline model that binary encodes the most common diagnoses. The study used diagnosis profiles and supplementary data from more than 10 million patients from 2016 to 2019 and was based on the largest German ambulatory claims data set. 
To describe subpopulations in health care, we identified clusters (via density-based clustering) and visualized patient vectors in 2D (via dimensionality reduction with uniform manifold approximation). Furthermore, we applied our vectorization model to predict prospective drug prescription costs based on patients' diagnoses. RESULTS Our final models outperform the baseline model (binary encoding) with equal dimensions. They are more robust to missing data and show large performance gains, particularly in lower dimensions, demonstrating the embedding model's compression of nonlinear information. In the future, other sources of health care data can be integrated into the current diagnosis-based framework. Other researchers can apply our publicly shared embedding model to their own diagnosis data. CONCLUSIONS We envision a wide range of applications for Pat2Vec that will improve health care quality, including personalized prevention and signal detection in patient surveillance as well as health care resource planning based on subcohorts identified by our data-driven machine learning framework.
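The binary-encoded baseline mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `binary_encode`, the four-code vocabulary, and the example patient are hypothetical and chosen only to show how a set of string-based diagnosis codes becomes a fixed-length 0/1 vector.

```python
from typing import Iterable, List, Sequence


def binary_encode(profile: Iterable[str], vocabulary: Sequence[str]) -> List[int]:
    """Return a 0/1 vector with one position per code in the vocabulary."""
    present = set(profile)
    return [1 if code in present else 0 for code in vocabulary]


# Hypothetical vocabulary standing in for "the most common diagnoses".
VOCAB = ["I10", "E11", "J45", "M54"]

# Hypothetical patient profile; Z00 is outside the vocabulary and is dropped.
patient = ["E11", "I10", "Z00"]
vector = binary_encode(patient, VOCAB)
print(vector)
```

Codes outside the vocabulary are silently lost in this encoding, which is one reason a learned embedding of the complete profile may be attractive.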
Affiliation(s)
- Edgar Steiger
- Zi Data Science Lab, Department IT and Data Science, Central Research Institute of Ambulatory Health Care in Germany (Zi), Berlin, Germany
| | - Lars Eric Kroll
- Zi Data Science Lab, Department IT and Data Science, Central Research Institute of Ambulatory Health Care in Germany (Zi), Berlin, Germany
| |
|
64
|
Lee YC, Jung SH, Kumar A, Shim I, Song M, Kim MS, Kim K, Myung W, Park WY, Won HH. ICD2Vec: Mathematical representation of diseases. J Biomed Inform 2023; 141:104361. [PMID: 37054960 DOI: 10.1016/j.jbi.2023.104361] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/28/2022] [Revised: 03/31/2023] [Accepted: 04/05/2023] [Indexed: 04/15/2023]
Abstract
BACKGROUND The International Classification of Diseases (ICD) codes represent the global standard for reporting disease conditions. The current ICD codes connote direct human-defined relationships among diseases in a hierarchical tree structure. Representing the ICD codes as mathematical vectors helps to capture nonlinear relationships in medical ontologies across diseases. METHODS We propose a universally applicable framework called "ICD2Vec" designed to provide mathematical representations of diseases by encoding corresponding information. First, we present the arithmetical and semantic relationships between diseases by mapping composite vectors for symptoms or diseases to the most similar ICD codes. Second, we investigated the validity of ICD2Vec by comparing the biological relationships and cosine similarities among the vectorized ICD codes. Third, we propose a new risk score called IRIS, derived from ICD2Vec, and demonstrate its clinical utility with large cohorts from the UK and South Korea. RESULTS Semantic compositionality was qualitatively confirmed between descriptions of symptoms and ICD2Vec. For example, the diseases most similar to COVID-19 were found to be the common cold (ICD-10: J00), unspecified viral hemorrhagic fever (ICD-10: A99), and smallpox (ICD-10: B03). We show the significant associations between the cosine similarities derived from ICD2Vec and the biological relationships using disease-to-disease pairs. Furthermore, we observed significant adjusted hazard ratios (HR) and areas under the receiver operating characteristic curve (AUROC) between IRIS and risks for eight diseases. For instance, a higher IRIS for coronary artery disease (CAD) was associated with a higher probability of incident CAD (HR: 2.15 [95% CI 2.02-2.28] and AUROC: 0.587 [95% CI 0.583-0.591]). We identified individuals at substantially increased risk of CAD using IRIS and 10-year atherosclerotic cardiovascular disease risk (adjusted HR, 4.26, 95% CI, 3.59-5.05).
CONCLUSIONS ICD2Vec, a proposed universal framework for converting qualitatively measured ICD codes into quantitative vectors containing semantic relationships between diseases, exhibited a significant correlation with actual biological significance. In addition, the IRIS was a significant predictor of major diseases in a prospective study using two large-scale Biobank EHR datasets. Based on this clinical validity and utility evidence, we suggest that publicly available ICD2Vec can be used in diverse research and clinical practices and has important clinical implications.
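The cosine-similarity lookup described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the ICD2Vec code: the three-dimensional vectors and the labels `CODE_A`, `CODE_B`, and `CODE_C` are made up, and `most_similar` is a hypothetical helper name.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def most_similar(
    query: Sequence[float],
    vectors: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank the stored codes by cosine similarity to the query vector."""
    scored = [(code, cosine_similarity(query, vec)) for code, vec in vectors.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]


# Made-up toy vectors; real disease embeddings have far more dimensions.
TOY_VECTORS = {
    "CODE_A": [1.0, 0.0, 0.0],
    "CODE_B": [0.9, 0.1, 0.0],
    "CODE_C": [0.0, 1.0, 0.0],
}

print(most_similar([1.0, 0.0, 0.0], TOY_VECTORS, top_k=2))
```

A composite query, such as the sum of several symptom vectors, can be passed as `query` in the same way, because cosine similarity ignores vector length.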
Affiliation(s)
- Yeong Chan Lee
- Department of Digital Health, Samsung Advanced Institute for Health Sciences & Technology (SAIHST), Sungkyunkwan University, Samsung Medical Center, Seoul, Republic of Korea
| | - Sang-Hyuk Jung
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA
| | - Aman Kumar
- Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal, India
| | - Injeong Shim
- Department of Digital Health, Samsung Advanced Institute for Health Sciences & Technology (SAIHST), Sungkyunkwan University, Samsung Medical Center, Seoul, Republic of Korea
| | - Minku Song
- Department of Digital Health, Samsung Advanced Institute for Health Sciences & Technology (SAIHST), Sungkyunkwan University, Samsung Medical Center, Seoul, Republic of Korea
| | - Min Seo Kim
- Department of Digital Health, Samsung Advanced Institute for Health Sciences & Technology (SAIHST), Sungkyunkwan University, Samsung Medical Center, Seoul, Republic of Korea
| | - Kyunga Kim
- Department of Digital Health, Samsung Advanced Institute for Health Sciences & Technology (SAIHST), Sungkyunkwan University, Samsung Medical Center, Seoul, Republic of Korea; Statistics and Data Center, Research Institute for Future Medicine, Samsung Medical Center, Seoul, Republic of Korea
| | - Woojae Myung
- Department of Neuropsychiatry, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
| | - Woong-Yang Park
- Samsung Genome Institute, Samsung Medical Center, Seoul, Republic of Korea
| | - Hong-Hee Won
- Department of Digital Health, Samsung Advanced Institute for Health Sciences & Technology (SAIHST), Sungkyunkwan University, Samsung Medical Center, Seoul, Republic of Korea; Samsung Genome Institute, Samsung Medical Center, Seoul, Republic of Korea.
| |
|
65
|
Bonde A, Bonde M, Troelsen A, Sillesen M. Assessing the utility of a sliding-windows deep neural network approach for risk prediction of trauma patients. Sci Rep 2023; 13:5176. [PMID: 36997598 PMCID: PMC10063587 DOI: 10.1038/s41598-023-32453-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Accepted: 03/28/2023] [Indexed: 04/03/2023] Open
Abstract
The risks of post-trauma complications are regulated by the injury, comorbidities, and the clinical trajectories, yet prediction models are often limited to single time-point data. We hypothesize that deep learning prediction models can be used for risk prediction with additive data after trauma using a sliding-windows approach. Using the American College of Surgeons Trauma Quality Improvement Program (ACS TQIP) database, we developed three deep neural network models for sliding-windows risk prediction. Output variables included early and late mortality and any of 17 complications. As patients moved through the treatment trajectories, performance metrics increased. Models predicted early and late mortality with ROC AUCs ranging from 0.980 to 0.994 and 0.910 to 0.972, respectively. For the remaining 17 complications, the mean performance ranged from 0.829 to 0.912. In summary, the deep neural networks achieved excellent performance in the sliding-windows risk stratification of trauma patients.
Affiliation(s)
- Alexander Bonde
- Department of Organ Surgery and Transplantation, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Center for Surgical Translational and Artificial Intelligence Research (CSTAR), Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
| | - Mikkel Bonde
- Department of Organ Surgery and Transplantation, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Center for Surgical Translational and Artificial Intelligence Research (CSTAR), Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
| | - Anders Troelsen
- Department of Orthopedics, Copenhagen University Hospital, Hvidovre, Denmark
| | - Martin Sillesen
- Department of Organ Surgery and Transplantation, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark.
- Center for Surgical Translational and Artificial Intelligence Research (CSTAR), Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark.
- Institute of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark.
| |
|
66
|
Hulsen T, Friedecký D, Renz H, Melis E, Vermeersch P, Fernandez-Calle P. From big data to better patient outcomes. Clin Chem Lab Med 2023; 61:580-586. [PMID: 36539928 DOI: 10.1515/cclm-2022-1096] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/30/2022] [Accepted: 12/12/2022] [Indexed: 12/24/2022]
Abstract
Among medical specialties, laboratory medicine is the largest producer of structured data and must play a crucial role in the efficient and safe implementation of big data and artificial intelligence in healthcare. The era of personalized therapies and precision medicine has now arrived, with huge data sets not only used for experimental and research approaches, but also in the "real world". Analysis of real-world data requires development of legal, procedural and technical infrastructure. The integration of all clinical data sets for any given patient is important and necessary in order to develop a patient-centered treatment approach. Data-driven research comes with its own challenges and solutions. The Findability, Accessibility, Interoperability, and Reusability (FAIR) Guiding Principles provide guidelines to make data findable, accessible, interoperable and reusable to the research community. Federated learning, standards and ontologies are useful to improve robustness of artificial intelligence algorithms working on big data and to increase trust in these algorithms. When dealing with big data, the univariate statistical approach gives way to multivariate statistical methods, significantly shifting the potential of big data. Combining multiple omics gives previously unsuspected information and provides understanding of scientific questions, an approach which is also called the systems biology approach. Big data and artificial intelligence also offer opportunities for laboratories and the In Vitro Diagnostic industry to optimize the productivity of the laboratory, the quality of laboratory results and ultimately patient outcomes, through tools such as predictive maintenance and "moving average" based on the aggregate of patient results.
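The "moving average" of patient results mentioned at the end of this abstract can be sketched as a simple fixed-window mean over a stream of results. The window size and the example values below are hypothetical, and real laboratory quality-control schemes typically add truncation limits and alarm rules that this sketch leaves out.

```python
from collections import deque
from typing import Deque, Iterable, Iterator


def moving_average(results: Iterable[float], window: int) -> Iterator[float]:
    """Yield the mean of the most recent `window` results once the window is full."""
    if window <= 0:
        raise ValueError("window must be a positive integer")
    recent: Deque[float] = deque(maxlen=window)
    total = 0.0
    for value in results:
        if len(recent) == window:
            total -= recent[0]  # the oldest value is about to be evicted
        recent.append(value)
        total += value
        if len(recent) == window:
            yield total / window


# Hypothetical stream of results for one analyte (arbitrary units).
stream = [1.0, 2.0, 3.0, 4.0]
print(list(moving_average(stream, window=2)))
```

A sustained drift of this running mean away from its usual level could then be flagged for review, although the decision rule itself is outside the scope of this sketch.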
Affiliation(s)
- Tim Hulsen
- Department of Hospital Services & Informatics, Philips Research, Eindhoven, The Netherlands
| | - David Friedecký
- Department of Clinical Biochemistry, Laboratory for Inherited Metabolic Disorders, University Hospital Olomouc and Faculty of Medicine and Dentistry, Palacký University in Olomouc, Olomouc, Czech Republic
| | - Harald Renz
- Institute of Laboratory Medicine, member of the German Center for Lung Research (DZL), and the Universities of Giessen and Marburg Lung Center (UGMLC), Philipps University Marburg, Marburg, Germany
- Department of Clinical Immunology and Allergy, Laboratory of Immunopathology, I.M. Sechenov First Moscow State Medical University (Sechenov University), Moscow, Russia
| | - Els Melis
- Ortho Clinical Diagnostics, Zaventem, Belgium
| | - Pieter Vermeersch
- Clinical Department of Laboratory Medicine, University Hospitals Leuven, Leuven, Belgium
- Department of Cardiovascular Sciences, KU Leuven, Leuven, Belgium
- European Federation of Clinical Chemistry and Laboratory Medicine (EFLM), Milan, Italy
| | - Pilar Fernandez-Calle
- European Federation of Clinical Chemistry and Laboratory Medicine (EFLM), Milan, Italy
- Department of Laboratory Medicine, Hospital Universitario La Paz, Madrid, Spain
| |
|
67
|
Kresoja KP, Unterhuber M, Wachter R, Thiele H, Lurz P. A cardiologist's guide to machine learning in cardiovascular disease prognosis prediction. Basic Res Cardiol 2023; 118:10. [PMID: 36939941 PMCID: PMC10027799 DOI: 10.1007/s00395-023-00982-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/22/2022] [Revised: 02/21/2023] [Accepted: 02/26/2023] [Indexed: 03/21/2023]
Abstract
A modern-day physician is faced with a vast abundance of clinical and scientific data, by far surpassing the capabilities of the human mind. Until the last decade, advances in data availability have not been accompanied by analytical approaches. The advent of machine learning (ML) algorithms might improve the interpretation of complex data and should help to translate the near endless amount of data into clinical decision-making. ML has become part of our everyday practice and might even further change modern-day medicine. It is important to acknowledge the role of ML in prognosis prediction of cardiovascular disease. The present review aims to prepare the modern physician and researcher for the challenges that ML might bring, explaining basic concepts but also caveats that might arise when using these methods. Further, a brief overview of currently established classical and emerging concepts of ML disease prediction in the fields of omics, imaging and basic science is presented.
Affiliation(s)
- Karl-Patrik Kresoja
- Department of Internal Medicine/Cardiology, Heart Center Leipzig at University of Leipzig, Struempellstr. 39, 04289, Leipzig, Germany
- Leipzig Heart Institute, Leipzig Heart Science at Heart Center Leipzig, Leipzig, Germany
| | - Matthias Unterhuber
- Department of Internal Medicine/Cardiology, Heart Center Leipzig at University of Leipzig, Struempellstr. 39, 04289, Leipzig, Germany
- Leipzig Heart Institute, Leipzig Heart Science at Heart Center Leipzig, Leipzig, Germany
| | - Rolf Wachter
- Department of Cardiology, University Hospital Leipzig, Leipzig, Germany
- Clinic for Cardiology and Pneumology, University Medicine Göttingen, Göttingen, Germany
- German Cardiovascular Research Center (DZHK), Partner Site Göttingen, Göttingen, Germany
| | - Holger Thiele
- Department of Internal Medicine/Cardiology, Heart Center Leipzig at University of Leipzig, Struempellstr. 39, 04289, Leipzig, Germany.
- Leipzig Heart Institute, Leipzig Heart Science at Heart Center Leipzig, Leipzig, Germany.
| | - Philipp Lurz
- Department of Internal Medicine/Cardiology, Heart Center Leipzig at University of Leipzig, Struempellstr. 39, 04289, Leipzig, Germany.
- Leipzig Heart Institute, Leipzig Heart Science at Heart Center Leipzig, Leipzig, Germany.
| |
|
68
|
Cho YS, Kim E, Stafford PL, Oh MH, Kwon Y. Identifying Disease of Interest With Deep Learning Using Diagnosis Code. J Korean Med Sci 2023; 38:e77. [PMID: 36942391 PMCID: PMC10027541 DOI: 10.3346/jkms.2023.38.e77] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Accepted: 12/18/2022] [Indexed: 03/08/2023] Open
Abstract
BACKGROUND An autoencoder (AE) is a deep learning technique that uses an artificial neural network to reconstruct its input data in the output layer. We constructed a novel supervised AE model and tested its performance in predicting the co-existence of a disease of interest using only diagnostic codes. METHODS Diagnostic codes of one million randomly sampled patients listed in the Korean National Health Information Database in 2019 were used to train, validate, and test the prediction model. The first model used the AE solely as a feature engineering tool that produces the input of a classifier: a Supervised Multi-Layer Perceptron (sMLP) was added to train a classifier that predicts a binary label with the latent representation as its input (AE + sMLP). The second model simultaneously updated the parameters in the AE and the connected MLP classifier during the learning process (End-to-End Supervised AE [EEsAE]). We tested the performances of these two models against baseline models, eXtreme Gradient Boosting (XGB) and naïve Bayes, in the prediction of co-existing gastric cancer diagnosis. RESULTS The proposed EEsAE model yielded the highest F1-score and highest area under the curve (0.86). The EEsAE and AE + sMLP gave the highest recalls. XGB yielded the highest precision. An ablation study revealed that iron deficiency anemia, gastroesophageal reflux disease, essential hypertension, gastric ulcers, benign prostatic hyperplasia, and shoulder lesion were the 6 most influential diagnoses on performance. CONCLUSION A novel EEsAE model showed promising performance in the prediction of a disease of interest.
Affiliation(s)
- Yoon-Sik Cho
- Department of Artificial Intelligence, Chung-Ang University, Seoul, Korea.
| | - Eunsun Kim
- Department of Data Science, Sejong University, Seoul, Korea
| | - Patrick L Stafford
- Department of Medicine, University of Virginia, Charlottesville, VA, USA
| | - Min-Hwan Oh
- Graduate School of Data Science, Seoul National University, Seoul, Korea
| | - Younghoon Kwon
- Department of Medicine, University of Washington, Seattle, WA, USA
| |
|
69
|
Abstract
Laboratory clinical decision support (CDS) typically relies on data from the electronic health record (EHR). The implementation of a sustainable, effective laboratory CDS program requires a commitment to standardization and harmonization of key EHR data elements that are the foundation of laboratory CDS. The direct use of artificial intelligence algorithms in CDS programs will be limited unless key elements of the EHR are structured. The identification, curation, maintenance, and preprocessing steps necessary to implement robust laboratory-based algorithms must account for the heterogeneity of data present in a typical EHR.
|
70
|
Kashyap A, Callison-Burch C, Boland MR. A deep learning method to detect opioid prescription and opioid use disorder from electronic health records. Int J Med Inform 2023; 171:104979. [PMID: 36621078 PMCID: PMC9898169 DOI: 10.1016/j.ijmedinf.2022.104979] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/27/2022] [Revised: 12/12/2022] [Accepted: 12/27/2022] [Indexed: 01/01/2023]
Abstract
OBJECTIVE As the opioid epidemic continues across the United States, methods are needed to accurately and quickly identify patients at risk for opioid use disorder (OUD). The purpose of this study is to develop two predictive algorithms: one to predict opioid prescription and one to predict OUD. MATERIALS AND METHODS We developed an informatics algorithm that trains two deep learning models over patient Electronic Health Records (EHRs) using the MIMIC-III database. We utilize both the structured and unstructured parts of the EHR and show that it is possible to predict both challenging outcomes. RESULTS Our deep learning models incorporate elements from EHRs to predict opioid prescription with an F1-score of 0.88 ± 0.003 and an AUC-ROC of 0.93 ± 0.002. We also constructed a model to predict OUD diagnosis achieving an F1-score of 0.82 ± 0.05 and AUC-ROC of 0.94 ± 0.008. DISCUSSION Our model for OUD prediction outperformed prior algorithms for specificity, F1 score and AUC-ROC while achieving equivalent sensitivity. This demonstrates the importance of a) deep learning approaches in predicting OUD and b) incorporating both structured and unstructured data for this prediction task. No prediction models for opioid prescription as an outcome were found in the literature and therefore our model is the first to predict opioid prescribing behavior. CONCLUSION Algorithms such as those described in this paper will become increasingly important to understand the drivers underlying this national epidemic.
Affiliation(s)
- Aditya Kashyap
- Department of Computer Science, University of Pennsylvania, United States of America
| | - Chris Callison-Burch
- Department of Computer Science, University of Pennsylvania, United States of America
| | - Mary Regina Boland
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, United States of America; Institute for Biomedical Informatics, University of Pennsylvania, United States of America; Center for Excellence in Environmental Toxicology, University of Pennsylvania, United States of America; Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, United States of America.
| |
|
71
|
Chekani F, Zhu Z, Khandker RK, Ai J, Meng W, Holler E, Dexter P, Boustani M, Ben Miled Z. Modeling acute care utilization: practical implications for insomnia patients. Sci Rep 2023; 13:2185. [PMID: 36750631 PMCID: PMC9905481 DOI: 10.1038/s41598-023-29366-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Accepted: 02/03/2023] [Indexed: 02/09/2023] Open
Abstract
Machine learning models can help improve health care services. However, they need to be practical to gain wide adoption. In this study, we investigate the practical utility of different data modalities and cohort segmentation strategies when designing models for emergency department (ED) and inpatient hospital (IH) visits. The data modalities include socio-demographics, diagnoses and medications. Segmentation compares a cohort of insomnia patients to a cohort of general non-insomnia patients under varying age and disease severity criteria. Transfer testing between the two cohorts is introduced to demonstrate that an insomnia-specific model is not necessary when predicting future ED visits, but may have merit when predicting IH visits, especially for patients with an insomnia diagnosis. The results also indicate that using both diagnoses and medications as a source of data does not generally improve model performance and may increase its overhead. Based on these findings, the proposed evaluation methodologies are recommended to ascertain the utility of disease-specific models in addition to the traditional intra-cohort testing.
Affiliation(s)
| | - Zitong Zhu
- Computer Science, IUPUI, Indianapolis, IN, 46202, USA
| | | | - Jizhou Ai
- Merck & Co., Inc., Rahway, NJ, 07065, USA
| | | | - Emma Holler
- School of Public Health, Indiana University, Bloomington, IN, 47405, USA
| | - Paul Dexter
- School of Medicine, Indiana University, Indianapolis, IN, 46202, USA.,Regenstrief Institute, Indianapolis, IN, 46202, USA
| | - Malaz Boustani
- School of Medicine, Indiana University, Indianapolis, IN, 46202, USA.,Regenstrief Institute, Indianapolis, IN, 46202, USA
| | - Zina Ben Miled
- Regenstrief Institute, Indianapolis, IN, 46202, USA. .,Electrical and Computer Engineering, IUPUI, Indianapolis, IN, 46202, USA.
| |
|
72
|
de Capretz PO, Björkelund A, Björk J, Ohlsson M, Mokhtari A, Nyström A, Ekelund U. Machine learning for early prediction of acute myocardial infarction or death in acute chest pain patients using electrocardiogram and blood tests at presentation. BMC Med Inform Decis Mak 2023; 23:25. [PMID: 36732708 PMCID: PMC9896766 DOI: 10.1186/s12911-023-02119-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Accepted: 01/23/2023] [Indexed: 02/04/2023] Open
Abstract
AIMS In the present study, we aimed to evaluate the performance of machine learning (ML) models for identification of acute myocardial infarction (AMI) or death within 30 days among emergency department (ED) chest pain patients. METHODS AND RESULTS Using data from 9519 consecutive ED chest pain patients, we created ML models based on logistic regression or artificial neural networks. Model inputs included sex, age, ECG and the first blood tests at patient presentation: high-sensitivity cardiac troponin T (hs-cTnT), glucose, creatinine, and hemoglobin. For a safe rule-out, the models were adapted to achieve a sensitivity > 99% and a negative predictive value (NPV) > 99.5% for 30-day AMI/death. For rule-in, we set the models to achieve a specificity > 90% and a positive predictive value (PPV) of > 70%. The models were also compared with the 0 h arm of the European Society of Cardiology algorithm (ESC 0 h): an initial hs-cTnT < 5 ng/L for rule-out and ≥ 52 ng/L for rule-in. A convolutional neural network was the best model and identified 55% of the patients for rule-out and 5.3% for rule-in, while maintaining the required sensitivity, specificity, NPV and PPV levels. ESC 0 h failed to reach these performance levels. DISCUSSION An ML model based on age, sex, ECG and blood tests at ED arrival can identify six out of ten chest pain patients for safe early rule-out or rule-in with no need for serial blood tests. Future studies should attempt to improve these ML models further, e.g. by including additional input data.
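The rule-out and rule-in criteria quoted in this abstract can be checked mechanically from a confusion matrix. The following is a minimal sketch, not the authors' code: the `Confusion` class, the two helper functions, and all patient counts are hypothetical and chosen only to illustrate the four thresholds named above.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts at one decision threshold; 'positive' means 30-day AMI/death."""
    tp: int  # events flagged by the model
    fp: int  # non-events flagged by the model
    tn: int  # non-events not flagged
    fn: int  # events not flagged (missed)

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


def meets_rule_out(c: Confusion) -> bool:
    # Thresholds from the abstract: sensitivity > 99% and NPV > 99.5%.
    return c.sensitivity() > 0.99 and c.npv() > 0.995


def meets_rule_in(c: Confusion) -> bool:
    # Thresholds from the abstract: specificity > 90% and PPV > 70%.
    return c.specificity() > 0.90 and c.ppv() > 0.70


# Hypothetical cohort of 2000 patients with 200 events.
# Low threshold: 1000 patients fall below it, one of whom has an event.
rule_out = Confusion(tp=199, fp=801, tn=999, fn=1)
# High threshold: 100 patients fall above it, 80 of whom have an event.
rule_in = Confusion(tp=80, fp=20, tn=1780, fn=120)

print(meets_rule_out(rule_out), meets_rule_in(rule_in))
```

In this framing each model needs two thresholds on its output score, one tuned for rule-out and one for rule-in; patients whose score falls between them remain undetermined.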
Affiliation(s)
- Pontus Olsson de Capretz
- Department of Internal and Emergency Medicine, Skåne University Hospital, Klinikgatan 15, 221 85 Lund, Sweden; Department of Clinical Sciences, Lund University, Lund, Sweden
| | - Anders Björkelund
- Department of Astronomy and Theoretical Physics, Lund University, Lund, Sweden
| | - Jonas Björk
- Division of Occupational and Environmental Medicine, Lund University, Lund, Sweden; Clinical Studies Sweden, Forum South, Skåne University Hospital, Lund, Sweden
| | - Mattias Ohlsson
- Department of Astronomy and Theoretical Physics, Lund University, Lund, Sweden; Center for Applied Intelligent Systems Research (CAISR), Halmstad University, Halmstad, Sweden
| | - Arash Mokhtari
- Department of Cardiology, Skåne University Hospital, Lund, Sweden; Department of Clinical Sciences, Lund University, Lund, Sweden
| | - Axel Nyström
- Division of Occupational and Environmental Medicine, Lund University, Lund, Sweden
| | - Ulf Ekelund
- Department of Internal and Emergency Medicine, Skåne University Hospital, Klinikgatan 15, 221 85 Lund, Sweden; Department of Clinical Sciences, Lund University, Lund, Sweden
| |
|
73
|
Lu M, Zhang Y, Zhang S, Shi H, Huang Z. Knowledge-aware patient representation learning for multiple disease subtypes. J Biomed Inform 2023; 138:104292. [PMID: 36641030 DOI: 10.1016/j.jbi.2023.104292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2022] [Accepted: 01/10/2023] [Indexed: 01/13/2023]
Abstract
Learning latent representations of patients with a target disease is a core problem in a broad range of downstream applications, such as clinical endpoint prediction. The target disease may have multiple subtypes with certain similarities and differences, which need to be addressed to learn effective patient representations that facilitate the downstream tasks. However, existing studies either ignore the distinction between disease subtypes and learn only disease-level representations, or neglect the correlations between subtypes and learn only subtype-level representations, which affects the performance of patient representation learning. To alleviate this problem, we studied how to effectively integrate data from all disease subtypes to improve the representation of each subtype. Specifically, we proposed a knowledge-aware shared-private neural network model to explicitly use disease-oriented knowledge and learn shared and specific representations from the disease and its subtype perspectives. To evaluate the feasibility of the proposed model, we conducted a particular downstream task, i.e., clinical endpoint prediction, on the basis of the learned patient representations. The results on the real-world clinical datasets demonstrated that our model could yield a significant improvement over state-of-the-art models.
Affiliation(s)
- Menglin Lu
- College of Computer Science and Technology, Zhejiang University, 866 Yuhangtang Road, 310058 Hangzhou, People's Republic of China.
| | - Yujie Zhang
- College of Computer Science and Technology, Zhejiang University, 866 Yuhangtang Road, 310058 Hangzhou, People's Republic of China.
| | - Suixia Zhang
- College of Computer Science and Technology, Zhejiang University, 866 Yuhangtang Road, 310058 Hangzhou, People's Republic of China.
| | - Hanrui Shi
- College of Computer Science and Technology, Zhejiang University, 866 Yuhangtang Road, 310058 Hangzhou, People's Republic of China.
| | - Zhengxing Huang
- College of Computer Science and Technology, Zhejiang University, 866 Yuhangtang Road, 310058 Hangzhou, People's Republic of China.
| |
|
74
|
Anish TP, Joe Prathap PM. An efficient and low complex model for optimal RBM features with weighted score-based ensemble multi-disease prediction. Comput Methods Biomech Biomed Engin 2023; 26:350-372. [PMID: 36218238 DOI: 10.1080/10255842.2022.2129969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Abstract
Multi-disease prediction is the capacity to simultaneously identify the various diseases that are expected to affect an individual over a certain period. These diseases may be at different stages of progression and need to be detected in the patient at the time of clinical visits. Many studies in the literature have proposed predictive models for particular diseases, yet such models cannot identify patients with multiple diseases, even though people often suffer from more than one disease at a time. Hence, this article aims to implement a novel multi-disease prediction model using an ensemble learning approach with deep features. The required data for the multi-disease prediction are collected from standard datasets. The collected data are then fed into a Deep Belief Network (DBN), where the features are obtained from the restricted Boltzmann machine (RBM) layers. These RBM features are tuned with Deviation-based Hybrid Grasshopper Barnacles Mating Optimization (D-HGBMO) to improve the prediction performance. The optimized RBM features are passed to an ensemble learning model named Ensemble, in which the multi-disease prediction is performed with a Deep Neural Network (DNN), an Extreme Learning Machine (ELM), and a Long Short-Term Memory (LSTM) network. The predicted scores from the three classifiers are combined through an optimized weighted score and thresholding-based final prediction, again using D-HGBMO, to determine the final multi-disease prediction results. The experimental results show the effective performance of the proposed model in comparison with existing classifiers across different quantitative measures.
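The final step described in this abstract, a weighted score over three classifiers followed by thresholding, can be illustrated with a short sketch. This is an assumption-laden illustration rather than the authors' method: the function name `weighted_ensemble`, the example scores, the weights, and the threshold are all hypothetical, and in the paper the weights and threshold are tuned by D-HGBMO instead of being fixed by hand.

```python
from typing import Sequence, Tuple


def weighted_ensemble(
    scores: Sequence[float],
    weights: Sequence[float],
    threshold: float = 0.5,
) -> Tuple[float, bool]:
    """Fuse per-classifier scores into one score and a thresholded label.

    scores    -- one predicted probability per classifier for a single disease
    weights   -- non-negative importance of each classifier (hypothetical here)
    threshold -- fused score at or above which the disease is predicted
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalised weighted average, so the fused score stays in [0, 1]
    # whenever every input score does.
    fused = sum(w * s for w, s in zip(weights, scores)) / total
    return fused, fused >= threshold


# Hypothetical scores from the DNN, ELM and LSTM for one disease label.
fused, predicted = weighted_ensemble([0.9, 0.4, 0.7], [0.5, 0.2, 0.3])
print(round(fused, 2), predicted)
```

For multi-disease prediction the same fusion would be applied once per disease label, possibly with a separate threshold for each.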
Affiliation(s)
- T P Anish
- Assistant Professor, Department of Computer Science and Engineering, R.M.K. College of Engineering and Technology, Puduvoyal, India
| | - P M Joe Prathap
- Professor, Department of Computer Science and Engineering, R.M.D. Engineering College, Kavaraipettai, India
| |
|
75
|
Fritzsche MC, Akyüz K, Cano Abadía M, McLennan S, Marttinen P, Mayrhofer MT, Buyx AM. Ethical layering in AI-driven polygenic risk scores-New complexities, new challenges. Front Genet 2023; 14:1098439. [PMID: 36816027 PMCID: PMC9933509 DOI: 10.3389/fgene.2023.1098439] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 11/14/2022] [Accepted: 01/04/2023] [Indexed: 01/27/2023] Open
Abstract
Researchers aim to develop polygenic risk scores as a tool to prevent and more effectively treat serious diseases, disorders and conditions such as breast cancer, type 2 diabetes mellitus and coronary heart disease. Recently, machine learning techniques, in particular deep neural networks, have been increasingly developed to create polygenic risk scores using electronic health records as well as genomic and other health data. While the use of artificial intelligence for polygenic risk scores may enable greater accuracy, performance and prediction, it also presents a range of increasingly complex ethical challenges. The ethical and social issues of many polygenic risk score applications in medicine have been widely discussed. However, in the literature and in practice, the ethical implications of their confluence with the use of artificial intelligence have not yet been sufficiently considered. Based on a comprehensive review of the existing literature, we argue that this stands in need of urgent consideration for research and subsequent translation into the clinical setting. Considering the many ethical layers involved, we will first give a brief overview of the development of artificial intelligence-driven polygenic risk scores, associated ethical and social implications, challenges in artificial intelligence ethics, and finally, explore potential complexities of polygenic risk scores driven by artificial intelligence. We point out emerging complexity regarding fairness, challenges in building trust, explaining and understanding artificial intelligence and polygenic risk scores as well as regulatory uncertainties and further challenges. We strongly advocate taking a proactive approach to embedding ethics in research and implementation processes for polygenic risk scores driven by artificial intelligence.
Affiliation(s)
- Marie-Christine Fritzsche
- Institute of History and Ethics in Medicine, TUM School of Medicine, Technical University of Munich, Munich, Germany; Department of Science, Technology and Society (STS), School of Social Sciences and Technology, Technical University of Munich, Munich, Germany; *Correspondence: Marie-Christine Fritzsche,
| | - Kaya Akyüz
- Biobanking and Biomolecular Resources Research Infrastructure Consortium - European Research Infrastructure Consortium (BBMRI-ERIC), Graz, Austria; Department of Science and Technology Studies, University of Vienna, Vienna, Austria
| | - Mónica Cano Abadía
- Biobanking and Biomolecular Resources Research Infrastructure Consortium - European Research Infrastructure Consortium (BBMRI-ERIC), Graz, Austria
| | - Stuart McLennan
- Institute of History and Ethics in Medicine, TUM School of Medicine, Technical University of Munich, Munich, Germany; Department of Science, Technology and Society (STS), School of Social Sciences and Technology, Technical University of Munich, Munich, Germany
| | - Pekka Marttinen
- Helsinki Institute for Information Technology HIIT, Aalto University, Helsinki, Finland
| | - Michaela Th. Mayrhofer
- Biobanking and Biomolecular Resources Research Infrastructure Consortium - European Research Infrastructure Consortium (BBMRI-ERIC), Graz, Austria
| | - Alena M. Buyx
- Institute of History and Ethics in Medicine, TUM School of Medicine, Technical University of Munich, Munich, Germany; Department of Science, Technology and Society (STS), School of Social Sciences and Technology, Technical University of Munich, Munich, Germany
| |
|
76
|
Abstract
Advancements in high-throughput sequencing have yielded vast amounts of genomic data, which are studied using genome-wide association study (GWAS)/phenome-wide association study (PheWAS) methods to identify associations between the genotype and phenotype. The associated findings have contributed to pharmacogenomics and improved clinical decision support at the point of care in many healthcare systems. However, the accumulation of genomic data from sequencing and clinical data from electronic health records (EHRs) poses significant challenges for data scientists. Following the rise of artificial intelligence (AI) technology such as machine learning and deep learning, an increasing number of GWAS/PheWAS studies have successfully leveraged this technology to overcome the aforementioned challenges. In this review, we focus on the application of data science and AI technology in three areas, including risk prediction and identification of causal single-nucleotide polymorphisms, EHR-based phenotyping and CRISPR guide RNA design. Additionally, we highlight a few emerging AI technologies, such as transfer learning and multi-view learning, which will or have started to benefit genomic studies.
Affiliation(s)
- Jing Lin
- NUHS Corporate Office, National University Health System, Singapore
| | - Kee Yuan Ngiam
- NUHS Corporate Office, National University Health System, Singapore; Department of Surgery, National University of Singapore, Singapore; Correspondence: A/Prof Kee Yuan Ngiam, Group Chief Technology Officer, NUHS Corporate Office, National University Health System, 1E Kent Ridge Road, 119228, Singapore. E-mail:
| |
|
77
|
Shaikh AK, Alhashmi SM, Khalique N, Khedr AM, Raahemifar K, Bukhari S. Bibliometric analysis on the adoption of artificial intelligence applications in the e-health sector. Digit Health 2023; 9:20552076221149296. [PMID: 36683951 PMCID: PMC9850136 DOI: 10.1177/20552076221149296] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2022] [Accepted: 12/18/2022] [Indexed: 01/19/2023] Open
Abstract
Artificial intelligence (AI) applications in e-health have evolved considerably in the last 25 years. To track the current research progress in this field, there is a need to analyze the most recent trends in adopting AI applications in e-health. This bibliometric analysis study covers AI applications in e-health. It differs from existing literature reviews in that the journal articles are obtained from the Scopus database from its beginning to late 2021 (25 years), which depicts the most recent trend of AI in e-health. Bibliometric analysis is employed to provide a statistical and quantitative analysis of the available literature in a specific field of study for a particular period. An extensive global literature review is performed to identify the significant research areas, authors, and their relationships through published articles. It also provides researchers with an overview of the evolution of work in specific research fields. The study's main contribution is to highlight the essential authors, journals, institutes, keywords, and states in developing the AI field in e-health.
Affiliation(s)
| | - Saadat M Alhashmi
- Department of Information Systems, College of Computing and Informatics, University of Sharjah, Sharjah, United Arab Emirates
- Saadat M Alhashmi, University of Sharjah, College of Computing and Informatics, Sharjah 27272, United Arab Emirates.
| | - Nadia Khalique
- College of Economics and Political Science, Sultan Qaboos University, Muscat, Oman
| | - Ahmed M. Khedr
- Department of Information Systems, College of Computing and Informatics, University of Sharjah, Sharjah, United Arab Emirates
| | | | - Sadaf Bukhari
- Beijing Institute of Technology, Beijing, Beijing, China
| |
|
78
|
Kumar K, Kumar P, Deb D, Unguresan ML, Muresan V. Artificial Intelligence and Machine Learning Based Intervention in Medical Infrastructure: A Review and Future Trends. Healthcare (Basel) 2023; 11:healthcare11020207. [PMID: 36673575 PMCID: PMC9859198 DOI: 10.3390/healthcare11020207] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Revised: 01/01/2023] [Accepted: 01/04/2023] [Indexed: 01/13/2023] Open
Abstract
People in the life sciences who work with Artificial Intelligence (AI) and Machine Learning (ML) are under increased pressure to develop algorithms faster than ever. The possibility of revealing innovative insights and speeding breakthroughs lies in using large datasets integrated on several levels. However, even if there is more data at our disposal than ever, only a meager portion is being filtered, interpreted, integrated, and analyzed. The subject of this technology is the study of how computers may learn from data and imitate human mental processes. Both an increase in the learning capacity and the provision of a decision support system at a size that is redefining the future of healthcare are enabled by AI and ML. This article offers a survey of the uses of AI and ML in the healthcare industry, with a particular emphasis on clinical, developmental, administrative, and global health implementations to support the healthcare infrastructure as a whole, along with the impact and expectations of each component of healthcare. Additionally, possible future trends and scopes of the utilization of this technology in medical infrastructure have also been discussed.
Affiliation(s)
- Kamlesh Kumar
- Department of Electrical and Computer Science Engineering, Institute of Infrastructure Technology Research And Management, Ahmedabad 380026, India
| | - Prince Kumar
- Department of Electrical and Computer Science Engineering, Institute of Infrastructure Technology Research And Management, Ahmedabad 380026, India
| | - Dipankar Deb
- Department of Electrical and Computer Science Engineering, Institute of Infrastructure Technology Research And Management, Ahmedabad 380026, India
- Correspondence:
| | | | - Vlad Muresan
- Department of Automation, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania
| |
|
79
|
Rathkopf C, Heinrichs B. Learning to Live with Strange Error: Beyond Trustworthiness in Artificial Intelligence Ethics. Camb Q Healthc Ethics 2023:1-13. [PMID: 36621773 DOI: 10.1017/s0963180122000688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/10/2023]
Abstract
Position papers on artificial intelligence (AI) ethics are often framed as attempts to work out technical and regulatory strategies for attaining what is commonly called trustworthy AI. In such papers, the technical and regulatory strategies are frequently analyzed in detail, but the concept of trustworthy AI is not. As a result, it remains unclear. This paper lays out a variety of possible interpretations of the concept and concludes that none of them is appropriate. The central problem is that, by framing the ethics of AI in terms of trustworthiness, we reinforce unjustified anthropocentric assumptions that stand in the way of clear analysis. Furthermore, even if we insist on a purely epistemic interpretation of the concept, according to which trustworthiness just means measurable reliability, it turns out that the analysis will, nevertheless, suffer from a subtle form of anthropocentrism. The paper goes on to develop the concept of strange error, which serves both to sharpen the initial diagnosis of the inadequacy of trustworthy AI and to articulate the novel epistemological situation created by the use of AI. The paper concludes with a discussion of how strange error puts pressure on standard practices of assessing moral culpability, particularly in the context of medicine.
Affiliation(s)
| | - Bert Heinrichs
- INM-7, Forschungszentrum Jülich GmbH, Jülich, Germany
- The Institute for Science and Ethics (IWE), The University of Bonn, Bonner Talweg 57, 53113, Germany
| |
|
80
|
Jujjavarapu C, Suri P, Pejaver V, Friedly J, Gold LS, Meier E, Cohen T, Mooney SD, Heagerty PJ, Jarvik JG. Predicting decompression surgery by applying multimodal deep learning to patients' structured and unstructured health data. BMC Med Inform Decis Mak 2023; 23:2. [PMID: 36609379 PMCID: PMC9824905 DOI: 10.1186/s12911-022-02096-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 05/30/2022] [Accepted: 12/29/2022] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND Low back pain (LBP) is a common condition made up of a variety of anatomic and clinical subtypes. Lumbar disc herniation (LDH) and lumbar spinal stenosis (LSS) are two subtypes highly associated with LBP. Patients with LDH/LSS are often started with non-surgical treatments and if those are not effective then go on to have decompression surgery. However, recommendation of surgery is complicated as the outcome may depend on the patient's health characteristics. We developed a deep learning (DL) model to predict decompression surgery for patients with LDH/LSS. MATERIALS AND METHOD We used datasets of 8387 and 8620 patients from a prospective study that collected data from four healthcare systems to predict early (within 2 months) and late surgery (within 12 months after a 2 month gap), respectively. We developed a DL model to use patients' demographics, diagnosis and procedure codes, drug names, and diagnostic imaging reports to predict surgery. For each prediction task, we evaluated the model's performance using classical and generalizability evaluation. For classical evaluation, we split the data into training (80%) and testing (20%). For generalizability evaluation, we split the data based on the healthcare system. We used the area under the curve (AUC) to assess performance for each evaluation. We compared results to a benchmark model (i.e. LASSO logistic regression). RESULTS For classical performance, the DL model outperformed the benchmark model for early surgery with an AUC of 0.725 compared to 0.597. For late surgery, the DL model outperformed the benchmark model with an AUC of 0.655 compared to 0.635. For generalizability performance, the DL model outperformed the benchmark model for early surgery. For late surgery, the benchmark model outperformed the DL model. CONCLUSIONS For early surgery, the DL model was preferred for classical and generalizability evaluation. However, for late surgery, the benchmark and DL model had comparable performance. Depending on the prediction task, the balance of performance may shift between DL and a conventional ML method. As a result, thorough assessment is needed to quantify the value of DL, a relatively computationally expensive, time-consuming and less interpretable method.
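The two evaluation schemes in this abstract differ only in how the data are partitioned before AUC is computed. The sketch below contrasts them under stated assumptions: the function names are hypothetical, the abstract does not say how the split by healthcare system was arranged (holding out each system in turn is assumed here), and the AUC is computed with the plain pairwise-ranking definition rather than any particular library.

```python
import random
from collections import defaultdict
from typing import Callable, Iterator, List, Sequence, Tuple, TypeVar

R = TypeVar("R")


def random_split(
    records: Sequence[R], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[R], List[R]]:
    """Classical evaluation: shuffle, then hold out a fixed fraction."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def site_splits(
    records: Sequence[R], site_of: Callable[[R], str]
) -> Iterator[Tuple[str, List[R], List[R]]]:
    """Generalizability evaluation (assumed form): hold out one site at a time."""
    by_site = defaultdict(list)
    for record in records:
        by_site[site_of(record)].append(record)
    for held_out, test in by_site.items():
        train = [r for site, rs in by_site.items() if site != held_out for r in rs]
        yield held_out, train, test


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    # Ties between a positive and a negative count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores for four patients.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

A model that scores well under `random_split` but poorly under `site_splits` has likely learned patterns specific to individual healthcare systems, which is the gap the generalizability evaluation is meant to expose.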
Affiliation(s)
- Chethan Jujjavarapu
- Department of Biomedical Informatics and Medical Education, School of Medicine, University of Washington, Box 358047, Seattle, WA, 98195, USA
| | - Pradeep Suri
- Clinical Learning, Evidence and Research Center, University of Washington, 4333 Brooklyn Ave NE, Seattle, WA, 98105, USA
- Department of Rehabilitation Medicine, University of Washington, 1959 NE Pacific St, Seattle, WA, 98195, USA
| | - Vikas Pejaver
- Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
| | - Janna Friedly
- Clinical Learning, Evidence and Research Center, University of Washington, 4333 Brooklyn Ave NE, Seattle, WA, 98105, USA
- Department of Rehabilitation Medicine, University of Washington, 1959 NE Pacific St, Seattle, WA, 98195, USA
| | - Laura S Gold
- Clinical Learning, Evidence and Research Center, University of Washington, 4333 Brooklyn Ave NE, Seattle, WA, 98105, USA
- Department of Radiology, University of Washington, 1959 NE Pacific Street, Seattle, WA, 98195, USA
| | - Eric Meier
- Clinical Learning, Evidence and Research Center, University of Washington, 4333 Brooklyn Ave NE, Seattle, WA, 98105, USA
- Department of Biostatistics, University of Washington, Box 357232, Seattle, WA, 98195-7232, USA
- Center for Biomedical Statistics, University of Washington, Seattle, WA, USA
| | - Trevor Cohen
- Department of Biomedical Informatics and Medical Education, School of Medicine, University of Washington, Box 358047, Seattle, WA, 98195, USA
| | - Sean D Mooney
- Department of Biomedical Informatics and Medical Education, School of Medicine, University of Washington, Box 358047, Seattle, WA, 98195, USA
| | - Patrick J Heagerty
- Department of Biostatistics, University of Washington, Box 357232, Seattle, WA, 98195-7232, USA
- Center for Biomedical Statistics, University of Washington, Seattle, WA, USA
| | - Jeffrey G Jarvik
- Clinical Learning, Evidence and Research Center, University of Washington, 4333 Brooklyn Ave NE, Seattle, WA, 98105, USA.
- Department of Radiology, University of Washington, 1959 NE Pacific Street, Seattle, WA, 98195, USA.
- Department of Neurological Surgery, University of Washington, 1959 NE Pacific Street, Seattle, WA, 98195, USA.
- Department of Health Services, University of Washington, Box 357660, Seattle, WA, 98195-7660, USA.
| |
|
81
|
Zaballa O, Pérez A, Gómez Inhiesto E, Acaiturri Ayesta T, Lozano JA. Learning the progression patterns of treatments using a probabilistic generative model. J Biomed Inform 2023; 137:104271. [PMID: 36529347 DOI: 10.1016/j.jbi.2022.104271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2022] [Revised: 11/18/2022] [Accepted: 12/09/2022] [Indexed: 12/16/2022]
Abstract
Modeling a disease or the treatment of a patient has drawn much attention in recent years due to the vast amount of information that Electronic Health Records contain. This paper presents a probabilistic generative model of treatments that are described in terms of sequences of medical activities of variable length. The main objective is to identify distinct subtypes of treatments for a given disease, and discover their development and progression. To this end, the model considers that a sequence of actions has an associated hierarchical structure of latent variables that both classifies the sequences based on their evolution over time, and segments the sequences into different progression stages. The learning procedure of the model is performed with the Expectation-Maximization algorithm which considers the exponential number of configurations of the latent variables and is efficiently solved with a method based on dynamic programming. The evaluation of the model is twofold: first, we use synthetic data to demonstrate that the learning procedure allows the generative model underlying the data to be recovered; we then further assess the potential of our model to provide treatment classification and staging information in real-world data. Our model can be seen as a tool for classification, simulation, data augmentation and missing data imputation.
Affiliation(s)
- Onintze Zaballa
- BCAM-Basque Center for Applied Mathematics, Bilbao 48009, Spain.
| | - Aritz Pérez
- BCAM-Basque Center for Applied Mathematics, Bilbao 48009, Spain.
| | | | | | - Jose A Lozano
- BCAM-Basque Center for Applied Mathematics, Bilbao 48009, Spain; Intelligent Systems Group, Department of Computer Science and Artificial Intelligence, University of the Basque Country UPV/EHU, Donostia 20018, Spain.
| |
|
82
|
Del Casale A, Sarli G, Bargagna P, Polidori L, Alcibiade A, Zoppi T, Borro M, Gentile G, Zocchi C, Ferracuti S, Preissner R, Simmaco M, Pompili M. Machine Learning and Pharmacogenomics at the Time of Precision Psychiatry. Curr Neuropharmacol 2023; 21:2395-2408. [PMID: 37559539 PMCID: PMC10616924 DOI: 10.2174/1570159x21666230808170123] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/07/2022] [Revised: 12/01/2022] [Accepted: 12/06/2022] [Indexed: 08/11/2023] Open
Abstract
Traditional medicine and biomedical sciences are reaching a turning point because of the constantly growing impact and volume of Big Data. Machine Learning (ML) techniques and related algorithms play a central role as diagnostic, prognostic, and decision-making tools in this field. Another promising area becoming part of everyday clinical practice is personalized therapy and pharmacogenomics. Applying ML to pharmacogenomics opens new frontiers for tailored therapeutic strategies that help clinicians choose drugs with the best response and fewer side effects, operating with genetic information and combining it with the clinical profile. This systematic review aims to describe the state of the art of ML applied to pharmacogenomics in psychiatry. Our research yielded fourteen papers; most were published in the last three years. The sample comprises 9,180 patients diagnosed with mood disorders, psychoses, or autism spectrum disorders. Prediction of drug response and prediction of side effects are the most frequently considered domains, addressed with supervised ML techniques, which first require training and then testing. The random forest is the most used algorithm; it comprises several decision trees, reduces overfitting to the training set, and makes precise predictions. ML proved effective and reliable, especially when genetic and biodemographic information were integrated into the algorithm. Even though ML and pharmacogenomics are not part of everyday clinical practice yet, they will gain a unique role in the near future in improving personalized treatments in psychiatry.
Affiliation(s)
- Antonio Del Casale
- Department of Dynamic and Clinical Psychology and Health Studies, Faculty of Medicine and Psychology, Sapienza University; Unit of Psychiatry, ‘Sant’Andrea’ University Hospital, Rome, Italy
| | - Giuseppe Sarli
- Department of Neuroscience, Mental Health and Sensory Organs (NESMOS), Faculty of Medicine and Psychology, Sapienza University; Unit of Psychiatry, ‘Sant’Andrea’ University Hospital, Rome, Italy
| | - Paride Bargagna
- Department of Neuroscience, Mental Health and Sensory Organs (NESMOS), Faculty of Medicine and Psychology, Sapienza University; Unit of Psychiatry, ‘Sant’Andrea’ University Hospital, Rome, Italy
| | - Lorenzo Polidori
- Department of Neuroscience, Mental Health and Sensory Organs (NESMOS), Faculty of Medicine and Psychology, Sapienza University; Unit of Psychiatry, ‘Sant’Andrea’ University Hospital, Rome, Italy
| | - Alessandro Alcibiade
- Department of Neuroscience, Mental Health and Sensory Organs (NESMOS), Faculty of Medicine and Psychology, Sapienza University; Unit of Psychiatry, ‘Sant’Andrea’ University Hospital, Rome, Italy
| | - Teodolinda Zoppi
- Department of Neuroscience, Mental Health and Sensory Organs (NESMOS), Faculty of Medicine and Psychology, Sapienza University; Unit of Psychiatry, ‘Sant’Andrea’ University Hospital, Rome, Italy
| | - Marina Borro
- Department of Neuroscience, Mental Health and Sensory Organs (NESMOS), Faculty of Medicine and Psychology, Sapienza University; Unit of Laboratory and Advanced Molecular Diagnostics, ‘Sant’Andrea’ University Hospital, Rome, Italy
| | - Giovanna Gentile
- Department of Neuroscience, Mental Health and Sensory Organs (NESMOS), Faculty of Medicine and Psychology, Sapienza University; Unit of Laboratory and Advanced Molecular Diagnostics, ‘Sant’Andrea’ University Hospital, Rome, Italy
| | - Clarissa Zocchi
- Department of Neuroscience, Mental Health and Sensory Organs (NESMOS), Faculty of Medicine and Psychology, Sapienza University; Unit of Psychiatry, ‘Sant’Andrea’ University Hospital, Rome, Italy
| | - Stefano Ferracuti
- Department of Human Neuroscience, Faculty of Medicine and Dentistry, Sapienza University, Unit of Risk Management, ‘Sant’Andrea’ University Hospital, Rome, Italy
| | - Robert Preissner
- Institute of Physiology and Science-IT, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Philippstrasse 12, 10115, Berlin, Germany
| | - Maurizio Simmaco
- Department of Neuroscience, Mental Health and Sensory Organs (NESMOS), Faculty of Medicine and Psychology, Sapienza University; Unit of Laboratory and Advanced Molecular Diagnostics, ‘Sant’Andrea’ University Hospital, Rome, Italy
| | - Maurizio Pompili
- Department of Neuroscience, Mental Health and Sensory Organs (NESMOS), Faculty of Medicine and Psychology, Sapienza University; Unit of Psychiatry, ‘Sant’Andrea’ University Hospital, Rome, Italy
| |
|
83
|
Alkhalaf M, Zhang Z, Chang HCR, Wei W, Yin M, Deng C, Yu P. Malnutrition and its contributing factors for older people living in residential aged care facilities: Insights from natural language processing of aged care records. Technol Health Care 2023; 31:2267-2278. [PMID: 37302059 DOI: 10.3233/thc-230229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
BACKGROUND Malnutrition is a serious health risk facing older people living in residential aged care facilities. Aged care staff record observations and concerns about older people in electronic health records (EHR), including free-text progress notes. These insights have yet to be exploited. OBJECTIVE This study explored the risk factors for malnutrition in structured and unstructured electronic health data. METHODS Data on weight loss and malnutrition were extracted from the de-identified EHR records of a large aged care organization in Australia. A literature review was conducted to identify causative factors for malnutrition. Natural language processing (NLP) techniques were applied to progress notes to extract these causative factors. NLP performance was evaluated by sensitivity, specificity, and F1-score. RESULTS The NLP methods were highly accurate in extracting the key data, values for 46 causative variables, from the free-text client progress notes. Thirty-three percent (1,469 out of 4,405) of the clients were malnourished. The structured, tabulated data recorded only 48% of these malnourished clients, far fewer than the 82% identified from the progress notes, suggesting the importance of using NLP technology to uncover information from nursing notes in order to fully understand the health status of vulnerable older people in residential aged care. CONCLUSION This study found that 33% of older people suffered from malnutrition, a lower proportion than reported in similar settings in previous studies. Our study demonstrates that NLP technology is important for uncovering key information about health risks for older people in residential aged care. Future research can apply NLP to predict other health risks for older people in this setting.
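The three evaluation measures named in the abstract can be computed from confusion-matrix counts. The following is a minimal sketch; the function name and the counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, and F1 from confusion-matrix counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # share of true cases found (recall)
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # share of non-cases correctly rejected
    precision = tp / (tp + fp) if (tp + fp) else 0.0    # share of flagged cases that are true
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0  # harmonic mean
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}


# Hypothetical counts for one extracted risk factor (illustrative only).
metrics = classification_metrics(tp=80, fp=10, tn=100, fn=20)
print(metrics)
```

Reporting specificity alongside sensitivity matters here because most progress notes will not mention a given factor, so a method could score well on one measure while failing on the other.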
Affiliation(s)
- Mohammad Alkhalaf
- Centre for Digital Transformation, School of Computing and Information Technology, University of Wollongong, Wollongong, Australia
- School of Computer Science, Qassim University, Buraydah, Saudi Arabia
| | - Zhenyu Zhang
- Centre for Digital Transformation, School of Computing and Information Technology, University of Wollongong, Wollongong, Australia
| | - Hui-Chen Rita Chang
- School of Nursing and Midwifery, Western Sydney University, Penrith, Australia
| | - Wenxi Wei
- School of Nursing, University of Wollongong, Wollongong, Australia
| | | | - Chao Deng
- School of Medical, Indigenous and Health Sciences, University of Wollongong, Wollongong, Australia
| | - Ping Yu
- Centre for Digital Transformation, School of Computing and Information Technology, University of Wollongong, Wollongong, Australia
| |
|
84
|
A machine learning method for improving liver cancer staging. J Biomed Inform 2023; 137:104266. [PMID: 36494059 DOI: 10.1016/j.jbi.2022.104266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2022] [Revised: 11/13/2022] [Accepted: 12/01/2022] [Indexed: 12/12/2022]
Abstract
Liver cancer is a common malignant tumor, and its clinical stage is closely related to the clinical treatment and prognosis of patients. Currently, the BCLC staging system revised by the BCLC group of the University of Barcelona is the globally recognized staging system for liver cancer. However, as related research has deepened, the current staging system can no longer fully meet clinical needs. In this work, we propose a novel machine learning method for constructing an automatic hepatocellular carcinoma staging model that incorporates far more clinical variables than any existing staging system. Our model is based on random survival forests, which generate a unique hazard function for each patient. B-splines are used to embed the hazard functions into vectors in a low-dimensional space, and a hierarchical clustering method groups similar patients to form staging cohorts. The resulting staging system significantly outperforms the BCLC system in terms of distinctiveness between patients in different stages.
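The final grouping step described in the abstract can be illustrated with a small agglomerative clustering sketch. The vectors below are hypothetical stand-ins for the B-spline embeddings of per-patient hazard functions, and average linkage with Euclidean distance is an assumption; the abstract does not state which linkage or distance the authors used.

```python
import math


def agglomerative(vectors, n_clusters):
    """Average-linkage agglomerative clustering.

    Returns a list of clusters, each a list of indices into `vectors`.
    """
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        # Mean Euclidean distance over all cross-cluster pairs.
        total = sum(math.dist(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]  # merge the closest pair
        del clusters[y]
    return clusters


# Hypothetical 2-D embeddings for five patients (illustrative only).
embeddings = [(0.10, 0.20), (0.15, 0.22), (0.90, 1.00), (0.95, 1.10), (2.00, 2.10)]
stages = agglomerative(embeddings, n_clusters=3)
print(stages)
```

Each resulting cluster would correspond to one candidate staging cohort; in practice a library routine would replace this quadratic loop for cohorts of realistic size.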
|
85
|
Hernandez B, Stiff O, Ming DK, Ho Quang C, Nguyen Lam V, Nguyen Minh T, Nguyen Van Vinh C, Nguyen Minh N, Nguyen Quang H, Phung Khanh L, Dong Thi Hoai T, Dinh The T, Huynh Trung T, Wills B, Simmons CP, Holmes AH, Yacoub S, Georgiou P. Learning meaningful latent space representations for patient risk stratification: Model development and validation for dengue and other acute febrile illness. Front Digit Health 2023; 5:1057467. [PMID: 36910574 PMCID: PMC9992802 DOI: 10.3389/fdgth.2023.1057467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Accepted: 01/05/2023] [Indexed: 02/24/2023] Open
Abstract
Background Increased data availability has prompted the creation of clinical decision support systems. These systems utilise clinical information to enhance health care provision, either to predict the likelihood of specific clinical outcomes or to evaluate the risk of further complications. However, their adoption remains low due to concerns regarding the quality of recommendations, and a lack of clarity on how results are best obtained and presented. Methods We used autoencoders capable of reducing the dimensionality of complex datasets in order to produce a 2D representation, denoted as latent space, to support understanding of complex clinical data. In this output, meaningful representations of individual patient profiles are spatially mapped in an unsupervised manner according to their input clinical parameters. This technique was then applied to a large real-world clinical dataset of over 12,000 patients with an illness compatible with dengue infection in Ho Chi Minh City, Vietnam between 1999 and 2021. Dengue is a systemic viral disease which exerts significant health and economic burden worldwide, and up to 5% of hospitalised patients develop life-threatening complications. Results The latent space produced by the selected autoencoder aligns with established clinical characteristics exhibited by patients with dengue infection, as well as features of disease progression. Similar clinical phenotypes are represented close to each other in the latent space and clustered according to outcomes broadly described by the World Health Organisation dengue guidelines. Balancing distance metrics and density metrics produced results covering most of the latent space, and improved visualisation whilst preserving utility, with similar patients grouped closer together. In this case, this balance is achieved by using the sigmoid activation function and one hidden layer with three neurons, in addition to the latent dimension layer, which produces the output (Pearson, 0.840; Spearman, 0.830; Procrustes, 0.301; GMM, 0.321). Conclusion This study demonstrates that when adequately configured, autoencoders can produce two-dimensional representations of a complex dataset that conserve the distance relationship between points. The output visualisation groups patients with clinically relevant features closely together and inherently supports user interpretability. Work is underway to incorporate these findings into an electronic clinical decision support system to guide individual patient management.
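One way to check whether a 2D representation conserves the distance relationship between points is to correlate pairwise distances before and after the reduction. The sketch below assumes that the Pearson figure in the abstract refers to this kind of comparison, which the abstract does not spell out; the data points and function names are hypothetical.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def distance_preservation(original, latent):
    """Correlation between pairwise distances in the two spaces."""
    return pearson(pairwise_distances(original), pairwise_distances(latent))


# Hypothetical patient profiles (3 features) and a hypothetical 2-D embedding
# that keeps the first two features at half scale.
original = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (3.0, 1.0, 0.0)]
latent = [(0.0, 0.0), (0.5, 0.0), (0.0, 1.0), (1.5, 0.5)]
score = distance_preservation(original, latent)
print(round(score, 3))
```

Because Pearson correlation is invariant to uniform scaling, an embedding that shrinks all distances by the same factor still scores close to 1; a lower score indicates that relative distances between patients have been distorted.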
Affiliation(s)
- Bernard Hernandez
- Centre for Bio-Inspired Technology, Imperial College London, London, United Kingdom
- Centre for Antimicrobial Optimisation, Imperial College London, London, United Kingdom
| | - Oliver Stiff
- Centre for Bio-Inspired Technology, Imperial College London, London, United Kingdom
| | - Damien K Ming
- Centre for Antimicrobial Optimisation, Imperial College London, London, United Kingdom
- NIHR HPRU in Healthcare Associated Infections and Antimicrobial Resistance, Imperial College London, London, United Kingdom
| | - Chanh Ho Quang
- Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam
| | - Vuong Nguyen Lam
- Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam
- University of Medicine and Pharmacy, Ho Chi Minh City, Vietnam
| | | | - Chau Nguyen Van Vinh
- Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam
- Hospital for Tropical Diseases, Ho Chi Minh City, Vietnam
| | | | - Huy Nguyen Quang
- Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam
| | - Lam Phung Khanh
- Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam
- University of Medicine and Pharmacy, Ho Chi Minh City, Vietnam
| | | | - Trung Dinh The
- Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam
| | - Trieu Huynh Trung
- Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam
- Hospital for Tropical Diseases, Ho Chi Minh City, Vietnam
| | - Bridget Wills
- Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam
- Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
| | - Cameron P Simmons
- Institute of Vector Borne Disease, Monash University, Melbourne, VIC, Australia
| | - Alison H Holmes
- Centre for Antimicrobial Optimisation, Imperial College London, London, United Kingdom
- NIHR HPRU in Healthcare Associated Infections and Antimicrobial Resistance, Imperial College London, London, United Kingdom
| | - Sophie Yacoub
- Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam
- Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
| | - Pantelis Georgiou
- Centre for Bio-Inspired Technology, Imperial College London, London, United Kingdom
- Centre for Antimicrobial Optimisation, Imperial College London, London, United Kingdom
| | | |
|
86
|
Guo J, Shan S, Ali Khan Y. What are the impetuses behind E-health applications' self-management services' ongoing adoption by health community participants? Health Informatics J 2023; 29:14604582231152801. [PMID: 36648056 DOI: 10.1177/14604582231152801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2023]
Abstract
Over the past 20 years, the identification of interventions related to healthcare management has been greatly facilitated by improvements in the well-being and health of the entire population. However, despite the positive developments in smart health applications and e-health research, two important gaps remain: (1) the role of gamification variables in the continued use of e-health applications has not been adequately assessed, and (2) the extent to which habit encourages people's continued use of e-health applications is unclear. Customers and companies can derive considerable value from exploring the health self-management services of e-health applications. Accordingly, this research aims to estimate customers' ongoing adoption of such services, incorporating habit and intrinsic and extrinsic variables into a study model, which is then tested. Perceived autonomy, perceived competence, and perceived relatedness were positively related to enjoyment and habit. Reward was positively related to perceived autonomy and continued use. Enjoyment and habit were positively associated with the decision to continue using e-health apps. The data collection sample comprised 269 individuals who have used Chinese e-health applications, reached via an online questionnaire. Data analysis was undertaken using Partial Least Squares Structural Equation Modelling (PLS-SEM). It was found that the ongoing adoption of e-health self-management services was driven to a greater extent by intrinsic variables; the results can inform companies' e-service strategies.
Affiliation(s)
- Jin Guo
- Department of Management, Lincoln International Business School, University of Lincoln, UK
| | - Shan Shan
- Business Analytics and Decision Making, Coventry University, UK
| | - Yousaf Ali Khan
- Department of Mathematics and Statistics, Hazara University, Pakistan
| |
|
87
|
Liang Y, Guo C. Heart failure disease prediction and stratification with temporal electronic health records data using patient representation. Biocybern Biomed Eng 2023. [DOI: 10.1016/j.bbe.2022.12.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2023]
|
88
|
Soman K, Nelson CA, Cerono G, Baranzini SE. Time-aware Embeddings of Clinical Data using a Knowledge Graph. PACIFIC SYMPOSIUM ON BIOCOMPUTING. PACIFIC SYMPOSIUM ON BIOCOMPUTING 2023; 28:97-108. [PMID: 36540968 PMCID: PMC9782808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/25/2022]
Abstract
Constructing meaningful representations of clinical data as embedding vectors is a pivotal step before invoking any machine learning (ML) algorithm for data inference. In this article, we propose a time-aware approach for embedding electronic health records onto a biomedical knowledge graph to create machine-readable patient representations. This approach not only captures the temporal dynamics of patient clinical trajectories, but also enriches them with additional biological information from the knowledge graph. To gauge the predictivity of this approach, we propose an ML pipeline called TANDEM (Temporal and Non-temporal Dynamics Embedded Model) and apply it to the early detection of Parkinson's disease. TANDEM achieves a classification AUC score of 0.85 on an unseen test dataset. These predictions are further explained by providing biological insight using the knowledge graph. Taken together, we show that temporal embeddings of clinical data could be a meaningful predictive representation for downstream ML pipelines in clinical decision-making.
|
89
|
Zhu Y, Wang M, Yin X, Zhang J, Meijering E, Hu J. Deep Learning in Diverse Intelligent Sensor Based Systems. SENSORS (BASEL, SWITZERLAND) 2022; 23:s23010062. [PMID: 36616657 PMCID: PMC9823653 DOI: 10.3390/s23010062] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/02/2022] [Revised: 12/06/2022] [Accepted: 12/14/2022] [Indexed: 05/27/2023]
Abstract
Deep learning has become a predominant method for solving data analysis problems in virtually all fields of science and engineering. The increasing complexity and the large volume of data collected by diverse sensor systems have spurred the development of deep learning methods and have fundamentally transformed the way the data are acquired, processed, analyzed, and interpreted. With the rapid development of deep learning technology and its ever-increasing range of successful applications across diverse sensor systems, there is an urgent need to provide a comprehensive investigation of deep learning in this domain from a holistic view. This survey paper aims to contribute to this by systematically investigating deep learning models/methods and their applications across diverse sensor systems. It also provides a comprehensive summary of deep learning implementation tips and links to tutorials, open-source codes, and pretrained models, which can serve as an excellent self-contained reference for deep learning practitioners and those seeking to innovate deep learning in this space. In addition, this paper provides insights into research topics in diverse sensor systems where deep learning has not yet been well-developed, and highlights challenges and future opportunities. This survey serves as a catalyst to accelerate the application and transformation of deep learning in diverse sensor systems.
Affiliation(s)
- Yanming Zhu
- School of Computer Science and Engineering, University of New South Wales, Sydney, NSW 2052, Australia
| | - Min Wang
- School of Engineering and Information Technology, University of New South Wales, Canberra, ACT 2612, Australia
| | - Xuefei Yin
- School of Engineering and Information Technology, University of New South Wales, Canberra, ACT 2612, Australia
| | - Jue Zhang
- School of Engineering and Information Technology, University of New South Wales, Canberra, ACT 2612, Australia
| | - Erik Meijering
- School of Computer Science and Engineering, University of New South Wales, Sydney, NSW 2052, Australia
| | - Jiankun Hu
- School of Engineering and Information Technology, University of New South Wales, Canberra, ACT 2612, Australia
| |
|
90
|
Ferrara M, Franchini G, Funaro M, Cutroni M, Valier B, Toffanin T, Palagini L, Zerbinati L, Folesani F, Murri MB, Caruso R, Grassi L. Machine Learning and Non-Affective Psychosis: Identification, Differential Diagnosis, and Treatment. Curr Psychiatry Rep 2022; 24:925-936. [PMID: 36399236 PMCID: PMC9780131 DOI: 10.1007/s11920-022-01399-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/12/2022] [Indexed: 11/19/2022]
Abstract
PURPOSE OF REVIEW This review covers the most relevant findings on the use of machine learning (ML) techniques in the field of non-affective psychosis by summarizing the studies published in the last three years that focus on illness detection and treatment. RECENT FINDINGS Multiple ML tools, mostly supervised approaches such as support vector machine, gradient boosting, and random forest, showed promising results when applied to various sources of data: socio-demographic information, EEG, language, digital content, blood biomarkers, neuroimaging, and electronic health records. However, the overall performance, in the binary classification case, varied from 0.49, which is to be considered very low (i.e., noise), to over 0.90. These results can be explained by different factors, some of which may be attributable to the preprocessing of the data, the wide variety of the data, and the a-priori setting of hyperparameters. One of the main limitations of the field is the lack of stratification of results based on biological sex, given that psychosis presents differently in men and women; hence the necessity to tailor identification tools and data analytic strategies. Timely identification and appropriate treatment are key factors in reducing the consequences of psychotic disorders. In recent years, the emergence of new analytical tools based on artificial intelligence, such as supervised ML approaches, has shown promise as a potential breakthrough in this field. However, ML applications in everyday practice are still in their infancy.
Affiliation(s)
- Maria Ferrara
- Department of Neuroscience and Rehabilitation, Institute of Psychiatry, University of Ferrara, via Fossato di Mortara 64/A, Ferrara, Italy.
- Department of Psychiatry, Yale School of Medicine, 34 Park Street, New Haven, CT, USA.
| | - Giorgia Franchini
- Department of Physics, Informatics and Mathematics, University of Modena and Reggio Emilia, Via Campi 213/B, Modena, Italy
- Department of Mathematics and Computer Science, University of Ferrara, Via Macchiavelli 33, Ferrara, Italy
| | - Melissa Funaro
- Harvey Cushing/John Hay Whitney Medical Library, Yale University, 333 Cedar St., New Haven, CT, USA
| | - Marcello Cutroni
- Department of Neuroscience and Rehabilitation, Institute of Psychiatry, University of Ferrara, via Fossato di Mortara 64/A, Ferrara, Italy
| | - Beatrice Valier
- Department of Neuroscience and Rehabilitation, Institute of Psychiatry, University of Ferrara, via Fossato di Mortara 64/A, Ferrara, Italy
| | - Tommaso Toffanin
- Department of Neuroscience and Rehabilitation, Institute of Psychiatry, University of Ferrara, via Fossato di Mortara 64/A, Ferrara, Italy
| | - Laura Palagini
- Department of Neuroscience and Rehabilitation, Institute of Psychiatry, University of Ferrara, via Fossato di Mortara 64/A, Ferrara, Italy
| | - Luigi Zerbinati
- Department of Neuroscience and Rehabilitation, Institute of Psychiatry, University of Ferrara, via Fossato di Mortara 64/A, Ferrara, Italy
| | - Federica Folesani
- Department of Neuroscience and Rehabilitation, Institute of Psychiatry, University of Ferrara, via Fossato di Mortara 64/A, Ferrara, Italy
| | - Martino Belvederi Murri
- Department of Neuroscience and Rehabilitation, Institute of Psychiatry, University of Ferrara, via Fossato di Mortara 64/A, Ferrara, Italy
| | - Rosangela Caruso
- Department of Neuroscience and Rehabilitation, Institute of Psychiatry, University of Ferrara, via Fossato di Mortara 64/A, Ferrara, Italy
| | - Luigi Grassi
- Department of Neuroscience and Rehabilitation, Institute of Psychiatry, University of Ferrara, via Fossato di Mortara 64/A, Ferrara, Italy
| |
|
91
|
Luo M, Wang YT, Wang XK, Hou WH, Huang RL, Liu Y, Wang JQ. A multi-granularity convolutional neural network model with temporal information and attention mechanism for efficient diabetes medical cost prediction. Comput Biol Med 2022; 151:106246. [PMID: 36343403 DOI: 10.1016/j.compbiomed.2022.106246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2022] [Revised: 09/30/2022] [Accepted: 10/22/2022] [Indexed: 12/27/2022]
Abstract
As the cost of diabetes treatment continues to grow, it is critical to accurately predict the medical costs of diabetes. Most medical cost studies based on convolutional neural networks (CNNs) ignore the importance of multi-granularity information of medical concepts and time interval characteristics of patients' multiple visit sequences, which reflect the frequency of patient visits and the severity of the disease. Therefore, this paper proposes a new end-to-end deep neural network structure, MST-CNN, for medical cost prediction. The MST-CNN model improves the representation quality of medical concepts by constructing a multi-granularity embedding model of medical concepts and incorporates a time interval vector to accurately measure the frequency of patient visits and form an accurate representation of medical events. Moreover, the MST-CNN model integrates a channel attention mechanism to adaptively adjust the channel weights to focus on significant medical features. The MST-CNN model systematically addresses the problem of temporal data representation in deep learning models. A case study and three comparative experiments are conducted using data collected from Pingjiang County. Through these experiments, the methods used in the proposed model are analyzed, and the model's superior performance is demonstrated.
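The time interval vector mentioned in the abstract can be pictured as the gap, in days, between consecutive visits. The sketch below is one plausible construction under that assumption; the abstract does not specify how MST-CNN encodes intervals, and the function name and dates are hypothetical.

```python
from datetime import date


def visit_intervals(visit_dates):
    """Days elapsed since the previous visit, in chronological order.

    The first visit has no predecessor and is assigned 0.
    """
    ordered = sorted(visit_dates)
    if not ordered:
        return []
    return [0] + [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


# Hypothetical visit history for one patient (illustrative only).
visits = [date(2020, 1, 10), date(2020, 1, 24), date(2020, 3, 1)]
intervals = visit_intervals(visits)
print(intervals)
```

Short intervals indicate frequent visits, which the abstract links to disease severity; a model could take this vector as an extra input alongside the embedded medical concepts of each visit.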
Affiliation(s)
- Min Luo
- School of Business, Central South University, Changsha, 410083, PR China
| | - Yi-Ting Wang
- School of Business, Central South University, Changsha, 410083, PR China
| | - Xiao-Kang Wang
- School of Business, Central South University, Changsha, 410083, PR China
| | - Wen-Hui Hou
- School of Business, Central South University, Changsha, 410083, PR China
| | - Rui-Lu Huang
- School of Business, Central South University, Changsha, 410083, PR China
| | - Ye Liu
- School of Business, Central South University, Changsha, 410083, PR China
| | - Jian-Qiang Wang
- School of Business, Central South University, Changsha, 410083, PR China.
| |
|
92
|
Datta S, Morassi Sasso A, Kiwit N, Bose S, Nadkarni G, Miotto R, Böttinger EP. Predicting hypertension onset from longitudinal electronic health records with deep learning. JAMIA Open 2022; 5:ooac097. [PMID: 36448021 PMCID: PMC9696747 DOI: 10.1093/jamiaopen/ooac097] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/13/2022] [Revised: 10/26/2022] [Accepted: 11/07/2022] [Indexed: 04/14/2024] Open
Abstract
Objective Hypertension has long been recognized as one of the most important predisposing factors for cardiovascular diseases and mortality. In recent years, machine learning methods have shown potential in diagnostic and predictive approaches in chronic diseases. Electronic health records (EHRs) have emerged as a reliable source of longitudinal data. The aim of this study is to predict the onset of hypertension using modern deep learning (DL) architectures, specifically long short-term memory (LSTM) networks, and longitudinal EHRs. Materials and Methods We compare this approach to the best performing models reported in previous works, particularly XGBoost, applied to aggregated features. Our work is based on data from 233 895 adult patients from a large health system in the United States. We divided our population into 2 distinct longitudinal datasets based on the diagnosis date. To ensure generalization to unseen data, we trained our models on the first dataset (dataset A "train and validation") using cross-validation, and then applied the models to a second dataset (dataset B "test") to assess their performance. We also experimented with 2 different time-windows before the onset of hypertension and evaluated the impact on model performance. Results With the LSTM network, we were able to achieve an area under the receiver operating characteristic curve value of 0.98 in the "train and validation" dataset A and 0.94 in the "test" dataset B for a prediction time window of 1 year. Lipid disorders, type 2 diabetes, and renal disorders are found to be associated with incident hypertension. Conclusion These findings show that DL models based on temporal EHR data can improve the identification of patients at high risk of hypertension and corresponding driving factors. In the long term, this work may support identifying individuals who are at high risk for developing hypertension and facilitate earlier intervention to prevent the future development of hypertension.
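The evaluation design in the abstract, splitting patients by diagnosis date and scoring the held-out set by area under the ROC curve, can be sketched as follows. The records, the cutoff date, and the risk scores are all hypothetical; the scores stand in for the output of a trained model, which is not reproduced here.

```python
def split_by_date(records, cutoff):
    """Split records into an earlier (train) and later (test) set by ISO date string."""
    train = [r for r in records if r["date"] < cutoff]
    test = [r for r in records if r["date"] >= cutoff]
    return train, test


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical patients: diagnosis date, outcome label, model risk score.
records = [
    {"date": "2014-05-01", "label": 1, "score": 0.9},
    {"date": "2015-02-11", "label": 0, "score": 0.2},
    {"date": "2017-08-30", "label": 1, "score": 0.8},
    {"date": "2018-01-15", "label": 0, "score": 0.3},
    {"date": "2018-06-02", "label": 1, "score": 0.4},
    {"date": "2019-11-20", "label": 0, "score": 0.6},
]
train, test = split_by_date(records, cutoff="2017-01-01")
test_auc = roc_auc([r["label"] for r in test], [r["score"] for r in test])
print(len(train), len(test), test_auc)
```

Splitting on date rather than at random keeps later patients entirely out of training, which gives a stricter estimate of how a model would behave on future data.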
Affiliation(s)
- Suparno Datta
- Digital Health Center, Hasso Plattner Institute, University of Potsdam, Potsdam, Germany
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Ariane Morassi Sasso
- Digital Health Center, Hasso Plattner Institute, University of Potsdam, Potsdam, Germany
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Nina Kiwit
- Digital Health Center, Hasso Plattner Institute, University of Potsdam, Potsdam, Germany
| | - Subhronil Bose
- Digital Health Center, Hasso Plattner Institute, University of Potsdam, Potsdam, Germany
| | - Girish Nadkarni
- Digital Health Center, Hasso Plattner Institute, University of Potsdam, Potsdam, Germany
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Riccardo Miotto
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Erwin P Böttinger
- Digital Health Center, Hasso Plattner Institute, University of Potsdam, Potsdam, Germany
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Windreich Department of Artificial Intelligence and Human Health, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| |
|
93
|
Explaining predictive factors in patient pathways using autoencoders. PLoS One 2022; 17:e0277135. [DOI: 10.1371/journal.pone.0277135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2022] [Accepted: 10/20/2022] [Indexed: 11/12/2022] Open
Abstract
This paper introduces an end-to-end methodology to predict a pathway-related outcome and identify predictive factors using autoencoders. A formal description of autoencoders for explainable binary predictions is presented, along with two objective functions that allow for filtering and inverting negative examples during training. A methodology to model and transform complex medical event logs is also proposed, which keeps the pathway information in terms of events and time, as well as the hierarchy information carried in medical codes. A case study is presented, in which the short-term mortality after the implementation of an Implantable Cardioverter-Defibrillator is predicted. The proposed methodologies have been tested and compared to other predictive methods, both explainable and not explainable. Results show the competitiveness of the method in terms of performance, particularly the use of a Variational Auto Encoder with an inverse objective function. Finally, the explainability of the method has been demonstrated, allowing for the identification of interesting predictive factors validated using relative risks.
Collapse
|
94
|
Carruthers R, Straw I, Ruffle JK, Herron D, Nelson A, Bzdok D, Fernandez-Reyes D, Rees G, Nachev P. Representational ethical model calibration. NPJ Digit Med 2022; 5:170. [PMID: 36333390 PMCID: PMC9636204 DOI: 10.1038/s41746-022-00716-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2022] [Accepted: 10/19/2022] [Indexed: 11/06/2022] Open
Abstract
Equity is widely held to be fundamental to the ethics of healthcare. In the context of clinical decision-making, it rests on the comparative fidelity of the intelligence - evidence-based or intuitive - guiding the management of each individual patient. Though brought to recent attention by the individuating power of contemporary machine learning, such epistemic equity arises in the context of any decision guidance, whether traditional or innovative. Yet no general framework for its quantification, let alone assurance, currently exists. Here we formulate epistemic equity in terms of model fidelity evaluated over learnt multidimensional representations of identity crafted to maximise the captured diversity of the population, introducing a comprehensive framework for Representational Ethical Model Calibration. We demonstrate the use of the framework on large-scale multimodal data from UK Biobank to derive diverse representations of the population, quantify model performance, and institute responsive remediation. We offer our approach as a principled solution to quantifying and assuring epistemic equity in healthcare, with applications across the research, clinical, and regulatory domains.
Affiliation(s)
- Robert Carruthers
- Department of Computer Science, University College London, London, UK.
| | - Isabel Straw
- UCL Queen Square Institute of Neurology, University College London, London, UK
| | - James K Ruffle
- UCL Queen Square Institute of Neurology, University College London, London, UK
| | - Daniel Herron
- Research and Development, NIHR University College London Hospitals Biomedical Research Centre, London, UK
| | - Amy Nelson
- UCL Queen Square Institute of Neurology, University College London, London, UK
| | - Danilo Bzdok
- Department of Biomedical Engineering, Faculty of Medicine, McGill University, Montreal, Canada
| | | | - Geraint Rees
- UCL Queen Square Institute of Neurology, University College London, London, UK
| | - Parashkev Nachev
- UCL Queen Square Institute of Neurology, University College London, London, UK.
|
95
|
de Jonge M, Wubben N, van Kaam CR, Frenzel T, Hoedemaekers CWE, Ambrogioni L, van der Hoeven JG, van den Boogaard M, Zegers M. Optimizing an existing prediction model for quality of life one-year post-intensive care unit: An exploratory analysis. Acta Anaesthesiol Scand 2022; 66:1228-1236. [PMID: 36054515 DOI: 10.1111/aas.14138] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/13/2022] [Revised: 07/12/2022] [Accepted: 07/31/2022] [Indexed: 01/07/2023]
Abstract
BACKGROUND This study aimed to improve the PREPARE model, an existing linear regression prediction model for long-term quality of life (QoL) of intensive care unit (ICU) survivors, by incorporating additional ICU data from patients' electronic health records (EHR) and bedside monitors. METHODS A total of 1308 adult ICU patients, aged ≥16, admitted between July 2016 and January 2019 were included. Several regression-based machine learning models were fitted on a combination of patient-reported data, expert-selected EHR variables, and bedside monitor data to predict change in QoL 1 year after ICU admission. Predictive performance was compared to a five-feature linear regression prediction model using only 24-hour data (R² = 0.54, mean square error (MSE) = 0.031, mean absolute error (MAE) = 0.128). RESULTS Of the included ICU survivors, 67.9% were male and the median age was 65.0 [IQR: 57.0-71.0]. Median length of stay (LOS) was 1 day [IQR: 1.0-2.0]. The incorporation of the additional data pertaining to the entire ICU stay did not improve the predictive performance of the original linear regression model. The best performing machine learning model used seven features (R² = 0.52, MSE = 0.032, MAE = 0.125). Pre-ICU QoL, the presence of a cerebrovascular accident (CVA) upon admission, and the highest temperature measured during the ICU stay were the most important contributors to predictive performance. Pre-ICU QoL's contribution to predictive performance far exceeded that of the other predictors. CONCLUSION Pre-ICU QoL was by far the most important predictor for change in QoL 1 year after ICU admission. The incorporation of the numerous additional features pertaining to the entire ICU stay did not improve predictive performance, although the patients' LOS was relatively short.
Affiliation(s)
- Manon de Jonge
- Department Intensive Care Medicine, Radboud University Medical Center, Radboud Institute for Health Sciences, Nijmegen, Netherlands
| | - Nina Wubben
- Department Intensive Care Medicine, Radboud University Medical Center, Radboud Institute for Health Sciences, Nijmegen, Netherlands
| | - Christiaan R van Kaam
- Department Intensive Care Medicine, Radboud University Medical Center, Radboud Institute for Health Sciences, Nijmegen, Netherlands
| | - Tim Frenzel
- Department Intensive Care Medicine, Radboud University Medical Center, Radboud Institute for Health Sciences, Nijmegen, Netherlands
| | - Cornelia W E Hoedemaekers
- Department Intensive Care Medicine, Radboud University Medical Center, Radboud Institute for Health Sciences, Nijmegen, Netherlands
| | - Luca Ambrogioni
- Radboud University, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, Netherlands
| | - Johannes G van der Hoeven
- Department Intensive Care Medicine, Radboud University Medical Center, Radboud Institute for Health Sciences, Nijmegen, Netherlands
| | - Mark van den Boogaard
- Department Intensive Care Medicine, Radboud University Medical Center, Radboud Institute for Health Sciences, Nijmegen, Netherlands
| | - Marieke Zegers
- Department Intensive Care Medicine, Radboud University Medical Center, Radboud Institute for Health Sciences, Nijmegen, Netherlands
|
96
|
Rabhi S, Blanchard F, Diallo AM, Zeghlache D, Lukas C, Berot A, Delemer B, Barraud S. Temporal deep learning framework for retinopathy prediction in patients with type 1 diabetes. Artif Intell Med 2022; 133:102408. [PMID: 36328668 DOI: 10.1016/j.artmed.2022.102408] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/24/2021] [Revised: 09/17/2022] [Accepted: 09/21/2022] [Indexed: 12/13/2022]
Abstract
The adoption of electronic health records in hospitals has ensured the availability of large datasets that can be used to predict medical complications. The trajectories of patients in real-world settings are highly variable, making longitudinal data modeling challenging. In recent years, significant progress has been made in the study of deep learning models applied to time series; however, the application of these models to irregular medical time series (IMTS) remains limited. To address this issue, we developed a generic deep-learning-based framework for modeling IMTS that facilitates the comparative studies of sequential neural networks (transformers and long short-term memory) and irregular time representation techniques. A validation study to predict retinopathy complications was conducted on 1207 patients with type 1 diabetes in a French database using their historical glycosylated hemoglobin measurements, without any data aggregation or imputation. The transformer-based model combined with the soft one-hot representation of time gaps achieved the highest score: an area under the receiver operating characteristic curve of 88.65%, specificity of 85.56%, sensitivity of 83.33% and an improvement of 11.7% over the same architecture without time information. This is the first attempt to predict retinopathy complications in patients with type 1 diabetes using deep learning and longitudinal data collected from patient visits. This study highlighted the significance of modeling time gaps between medical records to improve prediction performance and the utility of a generic framework for conducting extensive comparative studies.
Affiliation(s)
- Sara Rabhi
- Department RS2M, Télécom SudParis, 9 rue Charles Fourier, Evry, 91000, France.
| | - Frédéric Blanchard
- CRESTIC EA 3804, Université Reims Champagne-Ardenne, UFR Sciences Exactes et Naturelles, Moulin de la Housse, 51687, Reims, France
| | - Alpha Mamadou Diallo
- CHU de Reims - Hôpital Robert Debré, Service d'Endocrinologie - Diabète - Nutrition, Avenue du Général Koenig, 51092, Reims, France; Laboratoire de recherche en Santé Publique, Vieillissement, Qualité de vie et Réadaptation des Sujets Fragiles, EA 3797, Université Reims Champagne-Ardenne, 51092, Reims, France
| | - Djamal Zeghlache
- Department RS2M, Télécom SudParis, 9 rue Charles Fourier, Evry, 91000, France
| | - Céline Lukas
- CHU de Reims - Hôpital Robert Debré, Service d'Endocrinologie - Diabète - Nutrition, Avenue du Général Koenig, 51092, Reims, France; Laboratoire de recherche en Santé Publique, Vieillissement, Qualité de vie et Réadaptation des Sujets Fragiles, EA 3797, Université Reims Champagne-Ardenne, 51092, Reims, France
| | - Aurélie Berot
- CHU de Reims - American Memorial Hospital - Service de Pédiatrie, 47 rue Cognac Jay, 51092, Reims, France; Laboratoire d'Education et Pratiques de Santé, EA 3412, Université Sorbonne Paris Nord, 74 rue Marcel Cachin, 93017, Bobigny, France
| | - Brigitte Delemer
- CRESTIC EA 3804, Université Reims Champagne-Ardenne, UFR Sciences Exactes et Naturelles, Moulin de la Housse, 51687, Reims, France; CHU de Reims - Hôpital Robert Debré, Service d'Endocrinologie - Diabète - Nutrition, Avenue du Général Koenig, 51092, Reims, France
| | - Sara Barraud
- CRESTIC EA 3804, Université Reims Champagne-Ardenne, UFR Sciences Exactes et Naturelles, Moulin de la Housse, 51687, Reims, France; CHU de Reims - Hôpital Robert Debré, Service d'Endocrinologie - Diabète - Nutrition, Avenue du Général Koenig, 51092, Reims, France
|
97
|
Huang J, Zhao Y, Qu W, Tian Z, Tan Y, Wang Z, Tan S. Automatic recognition of schizophrenia from facial videos using 3D convolutional neural network. Asian J Psychiatr 2022; 77:103263. [PMID: 36152565 DOI: 10.1016/j.ajp.2022.103263] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/14/2022] [Revised: 08/22/2022] [Accepted: 09/14/2022] [Indexed: 11/17/2022]
Abstract
Schizophrenia affects patients, their families, and society because of chronic impairments in cognition, behavior, and emotion. However, its clinical diagnosis mainly depends on the clinicians' knowledge of the patients' symptoms. Other auxiliary diagnostic methods such as MRI and EEG are cumbersome and time-consuming. Recently, the convolutional neural network (CNN) has been applied to auxiliary diagnosis in psychiatry. Hence, in this study, a method based on deep learning and facial videos is proposed for the rapid detection of schizophrenia. Herein, 125 videos from 125 schizophrenic patients and 75 videos from 75 healthy controls, recorded during emotional stimulation tasks, were obtained. The video preprocessing included extraction of the experiment clips, face detection, facial region cropping, resizing to 500 × 500 pixels, and uniform sampling of 100 frames. The preprocessed facial videos were used to train the Resnet18_3D model. We used ten-fold cross-validation and a held-out test set to evaluate the model in terms of accuracy, precision, sensitivity, specificity, balanced accuracy, and AUC. The Resnet18_3D trained on Film_order achieved the best performance, with accuracy, sensitivity, specificity, balanced accuracy, and AUC of 89.00%, 96.80%, 76.00%, 86.40%, and 0.9397, respectively. The neural network model indeed distinguishes healthy controls from schizophrenic patients through changes in the facial area. The results show that facial video under emotional stimulation can be used to classify schizophrenic patients and help clinicians with diagnosis in the clinical environment. Among the different types of stimuli, the video stimuli with fixed emotional order showed the best classification performance.
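The reported accuracy, sensitivity, specificity, and balanced accuracy are mutually consistent, which can be checked with a minimal Python sketch. The confusion-matrix counts below are not given in the abstract; they are an assumption, inferred from the 125 patients and 75 controls and the reported sensitivity and specificity. The function name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute standard binary classification metrics from confusion counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
    }


# ASSUMED counts (not reported in the abstract): 121 of 125 patients and
# 57 of 75 controls classified correctly, inferred from 96.80% and 76.00%.
metrics = binary_metrics(tp=121, fn=4, tn=57, fp=18)

for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Under these assumed counts, the sketch yields the abstract's figures for all four count-based metrics. The AUC cannot be recovered this way, because it depends on the model's continuous scores rather than on a single decision threshold.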
Affiliation(s)
- Jie Huang
- Beijing HuiLongGuan Hospital, Peking University HuiLongGuan Clinical Medical School, Beijing, 100096, China
| | - Yanli Zhao
- Beijing HuiLongGuan Hospital, Peking University HuiLongGuan Clinical Medical School, Beijing, 100096, China
| | - Wei Qu
- Beijing HuiLongGuan Hospital, Peking University HuiLongGuan Clinical Medical School, Beijing, 100096, China
| | - Zhanxiao Tian
- Beijing HuiLongGuan Hospital, Peking University HuiLongGuan Clinical Medical School, Beijing, 100096, China
| | - Yunlong Tan
- Beijing HuiLongGuan Hospital, Peking University HuiLongGuan Clinical Medical School, Beijing, 100096, China
| | - Zhiren Wang
- Beijing HuiLongGuan Hospital, Peking University HuiLongGuan Clinical Medical School, Beijing, 100096, China
| | - Shuping Tan
- Beijing HuiLongGuan Hospital, Peking University HuiLongGuan Clinical Medical School, Beijing, 100096, China.
|
98
|
Simulation of a machine learning enabled learning health system for risk prediction using synthetic patient data. Sci Rep 2022; 12:17917. [PMID: 36289292 PMCID: PMC9606301 DOI: 10.1038/s41598-022-23011-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/29/2022] [Accepted: 10/21/2022] [Indexed: 01/20/2023] Open
Abstract
When enabled by machine learning (ML), Learning Health Systems (LHS) hold promise for improving the effectiveness of healthcare delivery to patients. One major barrier to LHS research and development is the lack of access to EHR patient data. To overcome this challenge, this study demonstrated the feasibility of developing a simulated ML-enabled LHS using synthetic patient data. The ML-enabled LHS was initialized using a dataset of 30,000 synthetic Synthea patients and a risk prediction XGBoost base model for lung cancer. Four additional datasets of 30,000 patients each were generated and added sequentially to the previously updated dataset to simulate the addition of new patients, resulting in datasets of 60,000, 90,000, 120,000, and 150,000 patients. New XGBoost models were built in each instance, and performance improved as data size increased, attaining 0.936 recall and 0.962 AUC (area under the curve) on the 150,000-patient dataset. The effectiveness of the new ML-enabled LHS process was verified by implementing XGBoost models for stroke risk prediction on the same Synthea patient populations. By making the ML code and synthetic patient data publicly available for testing and training, this first synthetic LHS process paves the way for more researchers to start developing LHS with real patient data.
|
99
|
Nagamine T, Gillette B, Kahoun J, Burghaus R, Lippert J, Saxena M. Data-driven identification of heart failure disease states and progression pathways using electronic health records. Sci Rep 2022; 12:17871. [PMID: 36284167 PMCID: PMC9596465 DOI: 10.1038/s41598-022-22398-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/18/2021] [Accepted: 10/13/2022] [Indexed: 01/20/2023] Open
Abstract
Heart failure (HF) is a leading cause of morbidity, healthcare costs, and mortality. Guideline-based segmentation of HF into distinct subtypes is coarse and unlikely to reflect the heterogeneity of etiologies and disease trajectories of patients. While analyses of electronic health records (EHR) show promise in expanding our understanding of complex syndromes like HF in an evidence-driven way, limitations in data quality have presented challenges for large-scale EHR-based insight generation and decision-making. We present a hypothesis-free approach to generating real-world characteristics and progression patterns of HF. Patient disease state snapshots are extracted from the complaints mentioned in unstructured clinical notes. Typical disease states are generated by clustering and characterized in terms of their distinguishing features, temporal relationships, and risk of important clinical events. Our analysis generates a comprehensive "disease phenome" of real-world patients computed from large, noisy, secondary-use EHR datasets created in a routine clinical setting.
Affiliation(s)
| | - Brian Gillette
- Department of Surgery, NYU Langone Long Island, Mineola, NY, USA; Department of Foundations of Medicine, NYU Long Island School of Medicine, Mineola, NY, USA
| | - John Kahoun
- Droice Research, New York, NY, USA; Clinical Informatics, CityMD, New York, NY, USA
| | - Rolf Burghaus
- Bayer AG, Wuppertal, Germany
| | - Jörg Lippert
- Bayer AG, Wuppertal, Germany
|
100
|
Kleiman MJ, Plewes AD, Owora A, Grout RW, Dexter PR, Fowler NR, Galvin JE, Miled ZB, Boustani M. Digital detection of dementia (D3): a study protocol for a pragmatic cluster-randomized trial examining the application of patient-reported outcomes and passive clinical decision support systems. Trials 2022; 23:868. [PMID: 36221141 PMCID: PMC9552361 DOI: 10.1186/s13063-022-06809-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2022] [Accepted: 09/30/2022] [Indexed: 11/14/2022] Open
Abstract
BACKGROUND Early detection of Alzheimer's disease and related dementias (ADRD) in a primary care setting is challenging due to time constraints and stigma. The implementation of scalable, sustainable, and patient-driven processes may improve early detection of ADRD; however, there are competing approaches: information may be obtained either directly from a patient (e.g., through a questionnaire) or passively using electronic health record (EHR) data. In this study, we aim to identify the benefit of a combined approach using a pragmatic cluster-randomized clinical trial. METHODS We have developed a Passive Digital Marker (PDM), based on machine learning algorithms applied to EHR data, and paired it with a patient-reported outcome (the Quick Dementia Rating Scale or QDRS) to rapidly share an identified risk of impairment with a patient's physician. Clinics in both south Florida and Indiana will be randomly assigned to one of three study arms: 1200 patients in each of the two populations will be administered either the PDM, the PDM with the QDRS, or neither, for a total of 7200 patients across all clinics and populations. Both incidence of ADRD diagnosis and acceptance into ADRD diagnostic work-up regimens are hypothesized to increase when patients are administered both the PDM and QDRS. Physicians performing the work-up regimens will be blinded to the study arm of the patient. DISCUSSION This study aims to test the accuracy and effectiveness of the two scalable approaches (PDM and QDRS) for the early detection of ADRD among older adults attending primary care practices. The data obtained in this study may lead to a national early detection and management program for ADRD as an efficient and beneficial method of reducing the current and future burden of ADRD, as well as improving the annual rate of newly documented ADRD in primary care practices. TRIAL REGISTRATION ClinicalTrials.gov Identifier: NCT05231954. Registered February 9, 2022.
Affiliation(s)
- Michael J Kleiman
- Comprehensive Center for Brain Health, Department of Neurology, University of Miami Miller School of Medicine, 7700 W Camino Real, Suite 200, Boca Raton, FL, 33433, USA.
| | - Abbi D Plewes
- Indiana University School of Medicine, Indianapolis, IN, 46202, USA
- Center for Health Innovation and Implementation Science, Indiana Clinical and Translational Science Institute, Indianapolis, IN, 46202, USA
| | - Arthur Owora
- Indiana University Bloomington School of Public Health, Bloomington, IN, 47405, USA
| | - Randall W Grout
- Indiana University School of Medicine, Indianapolis, IN, 46202, USA
- Regenstrief Institute, Inc., Indianapolis, IN, 46202, USA
| | - Paul Richard Dexter
- Indiana University School of Medicine, Indianapolis, IN, 46202, USA
- Regenstrief Institute, Inc., Indianapolis, IN, 46202, USA
| | - Nicole R Fowler
- Indiana University School of Medicine, Indianapolis, IN, 46202, USA
- Center for Health Innovation and Implementation Science, Indiana Clinical and Translational Science Institute, Indianapolis, IN, 46202, USA
- Regenstrief Institute, Inc., Indianapolis, IN, 46202, USA
- Indiana University Center for Aging Research, Indianapolis, IN, 46202, USA
| | - James E Galvin
- Comprehensive Center for Brain Health, Department of Neurology, University of Miami Miller School of Medicine, 7700 W Camino Real, Suite 200, Boca Raton, FL, 33433, USA
| | - Zina Ben Miled
- Indiana University School of Medicine, Indianapolis, IN, 46202, USA
| | - Malaz Boustani
- Indiana University School of Medicine, Indianapolis, IN, 46202, USA
- Center for Health Innovation and Implementation Science, Indiana Clinical and Translational Science Institute, Indianapolis, IN, 46202, USA
- Regenstrief Institute, Inc., Indianapolis, IN, 46202, USA
- Indiana University Center for Aging Research, Indianapolis, IN, 46202, USA
|