1
Sim JA, Huang X, Horan MR, Stewart CM, Robison LL, Hudson MM, Baker JN, Huang IC. Natural language processing with machine learning methods to analyze unstructured patient-reported outcomes derived from electronic health records: A systematic review. Artif Intell Med 2023; 146:102701. [PMID: 38042599 PMCID: PMC10693655 DOI: 10.1016/j.artmed.2023.102701] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/06/2023] [Revised: 09/30/2023] [Accepted: 10/29/2023] [Indexed: 12/04/2023]
Abstract
OBJECTIVE Natural language processing (NLP) combined with machine learning (ML) techniques are increasingly used to process unstructured/free-text patient-reported outcome (PRO) data available in electronic health records (EHRs). This systematic review summarizes the literature reporting NLP/ML systems/toolkits for analyzing PROs in clinical narratives of EHRs and discusses the future directions for the application of this modality in clinical care. METHODS We searched PubMed, Scopus, and Web of Science for studies written in English between 1/1/2000 and 12/31/2020. Seventy-nine studies meeting the eligibility criteria were included. We abstracted and summarized information related to the study purpose, patient population, type/source/amount of unstructured PRO data, linguistic features, and NLP systems/toolkits for processing unstructured PROs in EHRs. RESULTS Most of the studies used NLP/ML techniques to extract PROs from clinical narratives (n = 74) and mapped the extracted PROs into specific PRO domains for phenotyping or clustering purposes (n = 26). Some studies used NLP/ML to process PROs for predicting disease progression or onset of adverse events (n = 22) or developing/validating NLP/ML pipelines for analyzing unstructured PROs (n = 19). Studies used different linguistic features, including lexical, syntactic, semantic, and contextual features, to process unstructured PROs. Among the 25 NLP systems/toolkits we identified, 15 used rule-based NLP, 6 used hybrid NLP, and 4 used non-neural ML algorithms embedded in NLP. CONCLUSIONS This study supports the potential utility of different NLP/ML techniques in processing unstructured PROs available in EHRs for clinical care. Though using annotation rules for NLP/ML to analyze unstructured PROs is dominant, deploying novel neural ML-based methods is warranted.
Affiliation(s)
- Jin-Ah Sim
  - Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States; School of AI Convergence, Hallym University, Chuncheon, Republic of Korea
- Xiaolei Huang
  - Department of Computer Science, University of Memphis, Memphis, TN, United States
- Madeline R Horan
  - Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States
- Christopher M Stewart
  - Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States
- Leslie L Robison
  - Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States
- Melissa M Hudson
  - Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States; Department of Oncology, St. Jude Children's Research Hospital, Memphis, TN, United States
- Justin N Baker
  - Department of Pediatrics, Stanford University, Stanford, CA, United States
- I-Chan Huang
  - Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States
2
Moreno-Sánchez PA. Improvement of a prediction model for heart failure survival through explainable artificial intelligence. Front Cardiovasc Med 2023; 10:1219586. [PMID: 37600061 PMCID: PMC10434534 DOI: 10.3389/fcvm.2023.1219586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Accepted: 07/17/2023] [Indexed: 08/22/2023] Open
Abstract
Cardiovascular diseases and their associated disorder of heart failure (HF) are major causes of death globally, making it a priority for doctors to detect and predict their onset and medical consequences. Artificial Intelligence (AI) allows doctors to discover clinical indicators and enhance their diagnoses and treatments. Specifically, "eXplainable AI" (XAI) offers tools to improve the clinical prediction models that experience poor interpretability of their results. This work presents an explainability analysis and evaluation of two HF survival prediction models using a dataset that includes 299 patients who have experienced HF. The first model utilizes survival analysis, considering death events and time as target features, while the second model approaches the problem as a classification task to predict death. The model employs an optimization data workflow pipeline capable of selecting the best machine learning algorithm as well as the optimal collection of features. Moreover, different post hoc techniques have been used for the explainability analysis of the model. The main contribution of this paper is an explainability-driven approach to select the best HF survival prediction model balancing prediction performance and explainability. Therefore, the most balanced explainable prediction models are Survival Gradient Boosting model for the survival analysis and Random Forest for the classification approach with a c-index of 0.714 and balanced accuracy of 0.74 (std 0.03) respectively. The selection of features by the SCI-XAI in the two models is similar where "serum_creatinine", "ejection_fraction", and "sex" are selected in both approaches, with the addition of "diabetes" for the survival analysis model. Moreover, the application of post hoc XAI techniques also confirm common findings from both approaches by placing the "serum_creatinine" as the most relevant feature for the predicted outcome, followed by "ejection_fraction". 
The explainable prediction models for HF survival presented in this paper would improve the further adoption of clinical prediction models by providing doctors with insights to better understand the reasoning behind usually "black-box" AI clinical solutions and make more reasonable and data-driven decisions.
3
Obeid JS, Khalifa A, Xavier B, Bou-Daher H, Rockey DC. An AI Approach for Identifying Patients With Cirrhosis. J Clin Gastroenterol 2023; 57:82-88. [PMID: 34238846 PMCID: PMC8741865 DOI: 10.1097/mcg.0000000000001586] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/28/2021] [Accepted: 06/05/2021] [Indexed: 02/05/2023]
Abstract
GOAL The goal of this study was to evaluate an artificial intelligence approach, namely deep learning, on clinical text in electronic health records (EHRs) to identify patients with cirrhosis. BACKGROUND AND AIMS Accurate identification of cirrhosis in EHR is important for epidemiological, health services, and outcomes research. Currently, such efforts depend on International Classification of Diseases (ICD) codes, with limited success. MATERIALS AND METHODS We trained several machine learning models using discharge summaries from patients with known cirrhosis from a patient registry and random controls without cirrhosis or its complications based on ICD codes. Models were validated on patients for whom discharge summaries were manually reviewed and used as the gold standard test set. We tested Naive Bayes and Random Forest as baseline models and a deep learning model using word embedding and a convolutional neural network (CNN). RESULTS The training set included 446 cirrhosis patients and 689 controls, while the gold standard test set included 139 cirrhosis patients and 152 controls. Among the machine learning models, the CNN achieved the highest area under the receiver operating characteristic curve (0.993), with a precision of 0.965 and recall of 0.978, compared with 0.879 and 0.981 for the Naive Bayes and Random Forest, respectively (precision 0.787 and 0.958, and recalls 0.878 and 0.827). The precision by ICD codes for cirrhosis was 0.883 and recall was 0.978. CONCLUSIONS A CNN model trained on discharge summaries identified cirrhosis patients with high precision and recall. This approach for phenotyping cirrhosis in the EHR may provide a more accurate assessment of disease burden in a variety of studies.
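The precision and recall figures quoted in this abstract follow from confusion-matrix counts. The sketch below shows the arithmetic. The counts used in the usage note (136 true positives, 5 false positives, 3 false negatives) are illustrative assumptions chosen to be consistent with the rounded values and the 139 test-set cirrhosis patients reported above; they are not taken from the paper, and the F1 score is derived here rather than reported by the authors.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from true-positive, false-positive
    and false-negative counts. Empty denominators yield 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts, consistent with (but not taken from) the abstract.
p, r, f1 = precision_recall_f1(tp=136, fp=5, fn=3)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

With these assumed counts the function reproduces the rounded precision (0.965) and recall (0.978) quoted for the CNN; other count combinations could round to the same values.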
Affiliation(s)
- Jihad S. Obeid
  - Department of Public Health Sciences, Medical University of South Carolina, Charleston, South Carolina, USA
- Ali Khalifa
  - Division of Gastroenterology and Hepatology, Medical University of South Carolina, Charleston, South Carolina, USA
- Brandon Xavier
  - Division of Gastroenterology and Hepatology, Medical University of South Carolina, Charleston, South Carolina, USA
- Halim Bou-Daher
  - Division of Gastroenterology and Hepatology, Medical University of South Carolina, Charleston, South Carolina, USA
- Don C. Rockey
  - Division of Gastroenterology and Hepatology, Medical University of South Carolina, Charleston, South Carolina, USA
  - Medical University of South Carolina Digestive Disease Research Center, Medical University of South Carolina, Charleston, South Carolina, USA
4
Design and Implementation of a Comprehensive AI Dashboard for Real-Time Prediction of Adverse Prognosis of ED Patients. Healthcare (Basel) 2022; 10:1498. [PMID: 36011155 PMCID: PMC9408009 DOI: 10.3390/healthcare10081498] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/28/2022] [Revised: 08/02/2022] [Accepted: 08/03/2022] [Indexed: 11/16/2022] Open
Abstract
The emergency department (ED) is at the forefront of medical care, and the medical team needs to make outright judgments and treatment decisions under time constraints. Thus, knowing how to make personalized and precise predictions is a very challenging task. With the advancement of artificial intelligence (AI) technology, Chi Mei Medical Center (CMMC) adopted AI, the Internet of Things (IoT), and interaction technologies to establish diverse prognosis prediction models for eight diseases based on the ED electronic medical records of three branch hospitals. CMMC integrated these predictive models to form a digital AI dashboard, showing the risk status of all ED patients diagnosed with any of these eight diseases. This study first explored the methodology of CMMC’s AI development and proposed a four-tier AI dashboard architecture for ED implementation. The AI dashboard’s ease of use, usefulness, and acceptance was also strongly affirmed by the ED medical staff. The ED AI dashboard is an effective tool in the implementation of real-time risk monitoring of patients in the ED and could improve the quality of care as a part of best practice. Based on the results of this study, it is suggested that healthcare institutions thoughtfully consider tailoring their ED dashboard designs to adapt to their unique workflows and environments.
5
Natural language processing applied to mental illness detection: a narrative review. NPJ Digit Med 2022; 5:46. [PMID: 35396451 PMCID: PMC8993841 DOI: 10.1038/s41746-022-00589-7] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Received: 10/26/2021] [Accepted: 02/23/2022] [Indexed: 11/25/2022] Open
Abstract
Mental illness is highly prevalent nowadays, constituting a major cause of distress in people’s life with impact on society’s health and well-being. Mental illness is a complex multi-factorial disease associated with individual risk factors and a variety of socioeconomic, clinical associations. In order to capture these complex associations expressed in a wide variety of textual data, including social media posts, interviews, and clinical notes, natural language processing (NLP) methods demonstrate promising improvements to empower proactive mental healthcare and assist early diagnosis. We provide a narrative review of mental illness detection using NLP in the past decade, to understand methods, trends, challenges and future directions. A total of 399 studies from 10,467 records were included. The review reveals that there is an upward trend in mental illness detection NLP research. Deep learning methods receive more attention and perform better than traditional machine learning methods. We also provide some recommendations for future studies, including the development of novel detection methods, deep learning paradigms and interpretable models.
6
Fang A, Hu J, Zhao W, Feng M, Fu J, Feng S, Lou P, Ren H, Chen X. Extracting clinical named entity for pituitary adenomas from Chinese electronic medical records. BMC Med Inform Decis Mak 2022; 22:72. [PMID: 35321705 PMCID: PMC8941801 DOI: 10.1186/s12911-022-01810-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2020] [Accepted: 03/14/2022] [Indexed: 11/10/2022] Open
Abstract
OBJECTIVE Pituitary adenomas are the most common type of pituitary disorders, which usually occur in young adults and often affect the patient's physical development, labor capacity and fertility. Clinical free texts noted in electronic medical records (EMRs) of pituitary adenomas patients contain abundant diagnosis and treatment information. However, this information has not been well utilized because of the challenge to extract information from unstructured clinical texts. This study aims to enable machines to intelligently process clinical information, and automatically extract clinical named entity for pituitary adenomas from Chinese EMRs. METHODS The clinical corpus used in this study was from one pituitary adenomas neurosurgery treatment center of a 3A hospital in China. Four types of fine-grained texts of clinical records were selected, which included notes from present illness, past medical history, case characteristics and family history of 500 pituitary adenoma inpatients. The dictionary-based matching, conditional random fields (CRF), bidirectional long short-term memory with CRF (BiLSTM-CRF), and bidirectional encoder representations from transformers with BiLSTM-CRF (BERT-BiLSTM-CRF) were used to extract clinical entities from a Chinese EMRs corpus. A comprehensive dictionary was constructed based on open source vocabularies and a domain dictionary for pituitary adenomas to conduct the dictionary-based matching method. We selected features such as part of speech, radical, document type, and the position of characters to train the CRF-based model. Random character embeddings and the character embeddings pretrained by BERT were used respectively as the input features for the BiLSTM-CRF model and the BERT-BiLSTM-CRF model. Both strict metric and relaxed metric were used to evaluate the performance of these methods. 
RESULTS Experimental results demonstrated that the deep learning and other machine learning methods were able to automatically extract clinical named entities, including symptoms, body regions, diseases, family histories, surgeries, medications, and disease courses of pituitary adenomas from Chinese EMRs. With regard to overall performance, BERT-BiLSTM-CRF has the highest strict F1 value of 91.27% and the highest relaxed F1 value of 95.57% respectively. Additional evaluations showed that BERT-BiLSTM-CRF performed best in almost all entity recognition except surgery and disease course. BiLSTM-CRF performed best in disease course entity recognition, and performed as well as the CRF model for part of speech, radical and document type features, with both strict and relaxed F1 value reaching 96.48%. The CRF model with part of speech, radical and document type features performed best in surgery entity recognition with relaxed F1 value of 95.29%. CONCLUSIONS In this study, we conducted four entity recognition methods for pituitary adenomas based on Chinese EMRs. It demonstrates that the deep learning methods can effectively extract various types of clinical entities with satisfying performance. This study contributed to the clinical named entity extraction from Chinese neurosurgical EMRs. The findings could also assist in information extraction in other Chinese medical texts.
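The abstract reports both strict and relaxed F1 values but does not define the two metrics. A common convention, assumed here, is that a strict match requires identical entity boundaries and type, while a relaxed match requires the same type and any overlap in span. The sketch below illustrates that convention only; the `Entity` type, the function names, and the example spans are hypothetical and are not drawn from the paper.

```python
from typing import NamedTuple


class Entity(NamedTuple):
    start: int   # character offset, inclusive
    end: int     # character offset, exclusive
    label: str   # entity type, e.g. "symptom"


def _is_match(pred: Entity, gold: Entity, relaxed: bool) -> bool:
    """Strict: same type and identical boundaries.
    Relaxed (assumed definition): same type and overlapping spans."""
    if pred.label != gold.label:
        return False
    if relaxed:
        return pred.start < gold.end and gold.start < pred.end
    return pred.start == gold.start and pred.end == gold.end


def entity_f1(gold: list[Entity], pred: list[Entity], relaxed: bool = False) -> float:
    """Entity-level F1 under the strict or relaxed matching rule."""
    hit_pred = sum(any(_is_match(p, g, relaxed) for g in gold) for p in pred)
    hit_gold = sum(any(_is_match(p, g, relaxed) for p in pred) for g in gold)
    precision = hit_pred / len(pred) if pred else 0.0
    recall = hit_gold / len(gold) if gold else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0


# Hypothetical annotations: the second prediction is off by one character.
gold = [Entity(0, 2, "symptom"), Entity(5, 9, "disease")]
pred = [Entity(0, 2, "symptom"), Entity(6, 9, "disease")]
print(entity_f1(gold, pred, relaxed=False))  # boundary error counts as a miss
print(entity_f1(gold, pred, relaxed=True))   # overlap is accepted
```

Under this convention a relaxed score can never be lower than the strict score on the same predictions, which matches the ordering of the values reported above.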
Affiliation(s)
- An Fang
  - Life Science College, Central South University, No. 932 South Lushan Road, Changsha, 410083, China
  - Institute of Medical Information, Chinese Academy of Medical Sciences, No. 3 Yabao Road, Beijing, 100020, China
- Jiahui Hu
  - Institute of Medical Information, Chinese Academy of Medical Sciences, No. 3 Yabao Road, Beijing, 100020, China
- Wanqing Zhao
  - Institute of Medical Information, Chinese Academy of Medical Sciences, No. 3 Yabao Road, Beijing, 100020, China
- Ming Feng
  - Dongcheng District, Peking Union Medical College Hospital, No. 1 Shuaifuyuan, Beijing, 100730, China
- Ji Fu
  - Dongcheng District, Peking Union Medical College Hospital, No. 1 Shuaifuyuan, Beijing, 100730, China
- Shanshan Feng
  - Dongcheng District, Peking Union Medical College Hospital, No. 1 Shuaifuyuan, Beijing, 100730, China
- Pei Lou
  - Institute of Medical Information, Chinese Academy of Medical Sciences, No. 3 Yabao Road, Beijing, 100020, China
- Huiling Ren
  - Institute of Medical Information, Chinese Academy of Medical Sciences, No. 3 Yabao Road, Beijing, 100020, China
- Xianlai Chen
  - Big Data Institute, Central South University, No. 932 South Lushan Road, Changsha, 410083, China
  - National Engineering Lab for Medical Big Data Application Technology, Central South University, No. 932 South Lushan Road, Changsha, 410083, China
7
Hu W, Wang SY. Predicting Glaucoma Progression Requiring Surgery Using Clinical Free-Text Notes and Transfer Learning With Transformers. Transl Vis Sci Technol 2022; 11:37. [PMID: 35353148 PMCID: PMC8976929 DOI: 10.1167/tvst.11.3.37] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Indexed: 01/23/2023] Open
Abstract
Purpose We evaluated the use of massive transformer-based language models to predict glaucoma progression requiring surgery using ophthalmology clinical notes from electronic health records (EHRs). Methods Ophthalmology clinical notes for 4512 glaucoma patients at a single center from 2008 to 2020 were identified from the EHRs. Four different pre-trained Bidirectional Encoder Representations from Transformers (BERT)-based models were fine-tuned on ophthalmology clinical notes from the patients' first 120 days of follow-up for the task of predicting which patients would require glaucoma surgery. Models were evaluated with standard metrics, including area under the receiver operating characteristic curve (AUROC) and F1 score. Results Of the patients, 748 progressed to require glaucoma surgery (16.6%). The original BERT model had the highest AUROC (73.4%; F1 = 45.0%) for identifying these patients, followed by RoBERTa, with an AUROC of 72.4% (F1 = 44.7%); DistilBERT, with an AUROC of 70.2% (F1 = 42.5%); and BioBERT, with an AUROC of 70.1% (F1 = 41.7%). All models had higher F1 scores than an ophthalmologist's review of clinical notes (F1 = 29.9%). Conclusions Using transfer learning with massively pre-trained BERT-based models is a natural language processing approach that can access the wealth of clinical information stored within ophthalmology clinical notes to predict the progression of glaucoma. Future work to improve model performance can focus on integrating structured or imaging data or further tailoring the BERT models to ophthalmology domain-specific text. Translational Relevance Predictive models can provide the basis for clinical decision support tools to aid clinicians in identifying high- or low-risk patients to maximally tailor glaucoma treatments.
Affiliation(s)
- Wendeng Hu
  - Byers Eye Institute, Department of Ophthalmology, Stanford University School of Medicine, Palo Alto, CA, USA
- Sophia Y Wang
  - Byers Eye Institute, Department of Ophthalmology, Stanford University School of Medicine, Palo Alto, CA, USA
8
Mueller B, Kinoshita T, Peebles A, Graber MA, Lee S. Artificial intelligence and machine learning in emergency medicine: a narrative review. Acute Med Surg 2022; 9:e740. [PMID: 35251669 PMCID: PMC8887797 DOI: 10.1002/ams2.740] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 11/28/2021] [Revised: 01/26/2022] [Accepted: 02/06/2022] [Indexed: 12/20/2022] Open
Abstract
AIM The emergence and evolution of artificial intelligence (AI) has generated increasing interest in machine learning applications for health care. Specifically, researchers are grasping the potential of machine learning solutions to enhance the quality of care in emergency medicine. METHODS We undertook a narrative review of published works on machine learning applications in emergency medicine and provide a synopsis of recent developments. RESULTS This review describes fundamental concepts of machine learning and presents clinical applications for triage, risk stratification specific to disease, medical imaging, and emergency department operations. Additionally, we consider how machine learning models could contribute to the improvement of causal inference in medicine, and to conclude, we discuss barriers to safe implementation of AI. CONCLUSION We intend that this review serves as an introduction to AI and machine learning in emergency medicine.
Affiliation(s)
- Brianna Mueller
  - Department of Business Analytics, The University of Iowa Tippie College of Business, Iowa City, Iowa, USA
- Alexander Peebles
  - Department of Emergency Medicine, The University of Iowa Carver College of Medicine, Iowa City, Iowa, USA
- Mark A Graber
  - Department of Emergency Medicine, The University of Iowa Carver College of Medicine, Iowa City, Iowa, USA
- Sangil Lee
  - Department of Emergency Medicine, The University of Iowa Carver College of Medicine, Iowa City, Iowa, USA
9
Schwartz JM, Moy AJ, Rossetti SC, Elhadad N, Cato KD. Clinician involvement in research on machine learning-based predictive clinical decision support for the hospital setting: A scoping review. J Am Med Inform Assoc 2021; 28:653-663. [PMID: 33325504 PMCID: PMC7936403 DOI: 10.1093/jamia/ocaa296] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 08/19/2020] [Accepted: 11/30/2020] [Indexed: 01/03/2023] Open
Abstract
OBJECTIVE The study sought to describe the prevalence and nature of clinical expert involvement in the development, evaluation, and implementation of clinical decision support systems (CDSSs) that utilize machine learning to analyze electronic health record data to assist nurses and physicians in prognostic and treatment decision making (ie, predictive CDSSs) in the hospital. MATERIALS AND METHODS A systematic search of PubMed, CINAHL, and IEEE Xplore and hand-searching of relevant conference proceedings were conducted to identify eligible articles. Empirical studies of predictive CDSSs using electronic health record data for nurses or physicians in the hospital setting published in the last 5 years in peer-reviewed journals or conference proceedings were eligible for synthesis. Data from eligible studies regarding clinician involvement, stage in system design, predictive CDSS intention, and target clinician were charted and summarized. RESULTS Eighty studies met eligibility criteria. Clinical expert involvement was most prevalent at the beginning and late stages of system design. Most articles (95%) described developing and evaluating machine learning models, 28% of which described involving clinical experts, with nearly half functioning to verify the clinical correctness or relevance of the model (47%). DISCUSSION Involvement of clinical experts in predictive CDSS design should be explicitly reported in publications and evaluated for the potential to overcome predictive CDSS adoption challenges. CONCLUSIONS If present, clinical expert involvement is most prevalent when predictive CDSS specifications are made or when system implementations are evaluated. However, clinical experts are less prevalent in developmental stages to verify clinical correctness, select model features, preprocess data, or serve as a gold standard.
Affiliation(s)
- Amanda J Moy
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Sarah C Rossetti
  - School of Nursing, Columbia University, New York, New York, USA
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Noémie Elhadad
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Kenrick D Cato
  - School of Nursing, Columbia University, New York, New York, USA
  - Department of Emergency Medicine, Columbia University, New York, New York, USA
10
Abstract
Machine learning (ML) has been slowly entering every aspect of our lives and its positive impact has been astonishing. To accelerate embedding ML in more applications and incorporating it in real-world scenarios, automated machine learning (AutoML) is emerging. The main purpose of AutoML is to provide seamless integration of ML in various industries, which will facilitate better outcomes in everyday tasks. In healthcare, AutoML has been already applied to easier settings with structured data such as tabular lab data. However, there is still a need for applying AutoML for interpreting medical text, which is being generated at a tremendous rate. For this to happen, a promising method is AutoML for clinical notes analysis, which is an unexplored research area representing a gap in ML research. The main objective of this paper is to fill this gap and provide a comprehensive survey and analytical study towards AutoML for clinical notes. To that end, we first introduce the AutoML technology and review its various tools and techniques. We then survey the literature of AutoML in the healthcare industry and discuss the developments specific to clinical settings, as well as those using general AutoML tools for healthcare applications. With this background, we then discuss challenges of working with clinical notes and highlight the benefits of developing AutoML for medical notes processing. Next, we survey relevant ML research for clinical notes and analyze the literature and the field of AutoML in the healthcare industry. Furthermore, we propose future research directions and shed light on the challenges and opportunities this emerging field holds. With this, we aim to assist the community with the implementation of an AutoML platform for medical notes, which if realized can revolutionize patient outcomes.
11
Jacobucci R, Ammerman BA, Tyler Wilcox K. The use of text-based responses to improve our understanding and prediction of suicide risk. Suicide Life Threat Behav 2021; 51:55-64. [PMID: 33624877 DOI: 10.1111/sltb.12668] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
OBJECTIVE Text-based responses may provide significant contributions to suicide risk prediction, yet research including text data is limited. This may be due to a lack of exposure and familiarity with statistical analyses for this data structure. METHOD The current study provides an overview of data processing and statistical algorithms for text data, guided by an empirical example of 947 online participants who completed both open-ended items and traditional self-report measures. We give an introduction to a number of text-based statistical approaches, including dictionary-based methods, topic modeling, word embeddings, and deep learning. RESULTS We analyze responses from the open-ended question "How do you feel today?", detailing characteristics of the responses, as well as predicting past-year suicidal ideation. CONCLUSIONS We see the analysis of text from social media, open-ended questions, and other text sources (i.e., medical records) as an important form of complementary assessment to traditional scales, shedding insight on what we are missing in our current set of questionnaires, which may ultimately serve to improve both our understanding and prediction of suicide.
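Of the approaches this abstract lists, the dictionary-based method is the simplest to illustrate: each response is scored by the share of its words that appear in a predefined word list. The sketch below shows the general idea only. The lexicon, the function name, and the example sentence are hypothetical placeholders, not the dictionaries or data used in the study.

```python
import re

# Hypothetical word list for illustration; validated lexicons are used in practice.
NEGATIVE_AFFECT = frozenset({"sad", "tired", "hopeless", "anxious", "lonely"})


def dictionary_score(text: str, lexicon: frozenset[str] = NEGATIVE_AFFECT) -> float:
    """Return the fraction of word tokens in `text` that appear in `lexicon`.
    Returns 0.0 for text with no word tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(token in lexicon for token in tokens)
    return hits / len(tokens)


# Two of the six tokens ("tired", "sad") are in the lexicon.
print(dictionary_score("I feel tired and sad today"))
```

A score of this kind can then be used as one predictor alongside traditional self-report measures, which is the complementary role the abstract describes for text data.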
Affiliation(s)
- Ross Jacobucci
  - Department of Psychology, University of Notre Dame, Notre Dame, Indiana
- Brooke A Ammerman
  - Department of Psychology, University of Notre Dame, Notre Dame, Indiana
12
Tang KJW, Ang CKE, Constantinides T, Rajinikanth V, Acharya UR, Cheong KH. Artificial Intelligence and Machine Learning in Emergency Medicine. Biocybern Biomed Eng 2021. [DOI: 10.1016/j.bbe.2020.12.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 12/13/2022]
13
An intelligent multimodal medical diagnosis system based on patients’ medical questions and structured symptoms for telemedicine. Informatics in Medicine Unlocked 2021. [DOI: 10.1016/j.imu.2021.100513] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/17/2022] Open
14
Obeid JS, Davis M, Turner M, Meystre SM, Heider PM, O'Bryan EC, Lenert LA. An artificial intelligence approach to COVID-19 infection risk assessment in virtual visits: A case report. J Am Med Inform Assoc 2020; 27:1321-1325. [PMID: 32449766 PMCID: PMC7313981 DOI: 10.1093/jamia/ocaa105] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Received: 04/28/2020] [Revised: 05/07/2020] [Accepted: 05/21/2020] [Indexed: 12/15/2022] Open
Abstract
Objective In an effort to improve the efficiency of computer algorithms applied to screening for coronavirus disease 2019 (COVID-19) testing, we used natural language processing and artificial intelligence–based methods with unstructured patient data collected through telehealth visits. Materials and Methods After segmenting and parsing documents, we conducted analysis of overrepresented words in patient symptoms. We then developed a word embedding–based convolutional neural network for predicting COVID-19 test results based on patients’ self-reported symptoms. Results Text analytics revealed that concepts such as smell and taste were more prevalent than expected in patients testing positive. As a result, screening algorithms were adapted to include these symptoms. The deep learning model yielded an area under the receiver-operating characteristic curve of 0.729 for predicting positive results and was subsequently applied to prioritize testing appointment scheduling. Conclusions Informatics tools such as natural language processing and artificial intelligence methods can have significant clinical impacts when applied to data streams early in the development of clinical systems for outbreak response.
Affiliation(s)
- Jihad S Obeid
  - Department of Public Health Sciences, Medical University of South Carolina, Charleston, South Carolina, USA; Biomedical Informatics Center, Medical University of South Carolina, Charleston, South Carolina, USA
- Matthew Davis
  - Information Solutions, Medical University of South Carolina, Charleston, South Carolina, USA
- Matthew Turner
  - Information Solutions, Medical University of South Carolina, Charleston, South Carolina, USA
- Stephane M Meystre
  - Biomedical Informatics Center, Medical University of South Carolina, Charleston, South Carolina, USA; Department of Psychiatry and Behavioral Sciences, Medical University of South Carolina, Charleston, South Carolina, USA
- Paul M Heider
  - Biomedical Informatics Center, Medical University of South Carolina, Charleston, South Carolina, USA
- Edward C O'Bryan
  - Department of Emergency Medicine, Medical University of South Carolina, Charleston, South Carolina, USA
- Leslie A Lenert
  - Biomedical Informatics Center, Medical University of South Carolina, Charleston, South Carolina, USA; Department of Medicine, Medical University of South Carolina, Charleston, South Carolina, USA
|
15
|
Sung SF, Lin CY, Hu YH. EMR-Based Phenotyping of Ischemic Stroke Using Supervised Machine Learning and Text Mining Techniques. IEEE J Biomed Health Inform 2020; 24:2922-2931. [DOI: 10.1109/jbhi.2020.2976931] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 11/09/2022]
|
16
|
Moen H, Hakala K, Peltonen LM, Matinolli HM, Suhonen H, Terho K, Danielsson-Ojala R, Valta M, Ginter F, Salakoski T, Salanterä S. Assisting nurses in care documentation: from automated sentence classification to coherent document structures with subject headings. J Biomed Semantics 2020; 11:10. [PMID: 32873340 PMCID: PMC7465411 DOI: 10.1186/s13326-020-00229-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/19/2019] [Accepted: 08/14/2020] [Indexed: 11/10/2022] [Open Access]
Abstract
Background Up to 35% of nurses’ working time is spent on care documentation. We describe the evaluation of a system aimed at assisting nurses in documenting patient care and potentially reducing the documentation workload. Our goal is to enable nurses to write or dictate nursing notes in a narrative manner without having to manually structure their text under subject headings. In the current care classification standard used in the targeted hospital, there are more than 500 subject headings to choose from, making it challenging and time consuming for nurses to use. Methods The task of the presented system is to automatically group sentences into paragraphs and assign subject headings. For classification the system relies on a neural network-based text classification model. The nursing notes are initially classified on sentence level. Subsequently coherent paragraphs are constructed from related sentences. Results Based on a manual evaluation conducted by a group of three domain experts, we find that in about 69% of the paragraphs formed by the system the topics of the sentences are coherent and the assigned paragraph headings correctly describe the topics. We also show that the use of a paragraph merging step reduces the number of paragraphs produced by 23% without affecting the performance of the system. Conclusions The study shows that the presented system produces a coherent and logical structure for freely written nursing narratives and has the potential to reduce the time and effort nurses are currently spending on documenting care in hospitals.
Affiliation(s)
- Hans Moen
  - Department of Future Technologies, University of Turku, Vesilinnantie 5, Turku, 20500, Finland
- Kai Hakala
  - Department of Future Technologies, University of Turku, Vesilinnantie 5, Turku, 20500, Finland; University of Turku Graduate School, University of Turku, Hämeenkatu 4, Turku, 20500, Finland
- Laura-Maria Peltonen
  - Department of Nursing Science, University of Turku, Joukahaisenkatu 3-5, Turku, 20520, Finland
- Hanna-Maria Matinolli
  - Department of Nursing Science, University of Turku, Joukahaisenkatu 3-5, Turku, 20520, Finland
- Henry Suhonen
  - Department of Nursing Science, University of Turku, Joukahaisenkatu 3-5, Turku, 20520, Finland; Turku University Hospital, Kiinamyllynkatu 4-8, Turku, 20521, Finland
- Kirsi Terho
  - Department of Nursing Science, University of Turku, Joukahaisenkatu 3-5, Turku, 20520, Finland; Turku University Hospital, Kiinamyllynkatu 4-8, Turku, 20521, Finland
- Riitta Danielsson-Ojala
  - Department of Nursing Science, University of Turku, Joukahaisenkatu 3-5, Turku, 20520, Finland; Turku University Hospital, Kiinamyllynkatu 4-8, Turku, 20521, Finland
- Maija Valta
  - Turku University Hospital, Kiinamyllynkatu 4-8, Turku, 20521, Finland
- Filip Ginter
  - Department of Future Technologies, University of Turku, Vesilinnantie 5, Turku, 20500, Finland
- Tapio Salakoski
  - Department of Future Technologies, University of Turku, Vesilinnantie 5, Turku, 20500, Finland
- Sanna Salanterä
  - Department of Nursing Science, University of Turku, Joukahaisenkatu 3-5, Turku, 20520, Finland; Turku University Hospital, Kiinamyllynkatu 4-8, Turku, 20521, Finland
|
17
|
Obeid JS, Dahne J, Christensen S, Howard S, Crawford T, Frey LJ, Stecker T, Bunnell BE. Identifying and Predicting Intentional Self-Harm in Electronic Health Record Clinical Notes: Deep Learning Approach. JMIR Med Inform 2020; 8:e17784. [PMID: 32729840 PMCID: PMC7426805 DOI: 10.2196/17784] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 01/13/2020] [Revised: 04/25/2020] [Accepted: 05/21/2020] [Indexed: 12/20/2022] [Open Access]
Abstract
BACKGROUND Suicide is an important public health concern in the United States and around the world. There has been significant work examining machine learning approaches to identify and predict intentional self-harm and suicide using existing data sets. With recent advances in computing, deep learning applications in health care are gaining momentum. OBJECTIVE This study aimed to leverage the information in clinical notes using deep neural networks (DNNs) to (1) improve the identification of patients treated for intentional self-harm and (2) predict future self-harm events. METHODS We extracted clinical text notes from electronic health records (EHRs) of 835 patients with International Classification of Diseases (ICD) codes for intentional self-harm and 1670 matched controls who never had any intentional self-harm ICD codes. The data were divided into training and holdout test sets. We tested a number of algorithms on clinical notes associated with the intentional self-harm codes using the training set, including several traditional bag-of-words-based models and 2 DNN models: a convolutional neural network (CNN) and a long short-term memory model. We also evaluated the predictive performance of the DNNs on a subset of patients who had clinical notes 1 to 6 months before the first intentional self-harm event. Finally, we evaluated the impact of a pretrained model using Word2vec (W2V) on performance. RESULTS The area under the receiver operating characteristic curve (AUC) for the CNN on the phenotyping task, that is, the detection of intentional self-harm in clinical notes concurrent with the events was 0.999, with an F1 score of 0.985. In the predictive task, the CNN achieved the highest performance with an AUC of 0.882 and an F1 score of 0.769. Although pretraining with W2V shortened the DNN training time, it did not improve performance. 
CONCLUSIONS The strong performance on the first task, namely, phenotyping based on clinical notes, suggests that such models could be used effectively for surveillance of intentional self-harm in clinical text in an EHR. The modest performance on the predictive task notwithstanding, the results using DNN models on clinical text alone are competitive with other reports in the literature using risk factors from structured EHR data.
Affiliation(s)
- Jihad S Obeid
  - Medical University of South Carolina, Charleston, SC, United States
- Jennifer Dahne
  - Medical University of South Carolina, Charleston, SC, United States
- Sean Christensen
  - Medical University of South Carolina, Charleston, SC, United States
- Samuel Howard
  - Medical University of South Carolina, Charleston, SC, United States
- Tami Crawford
  - Medical University of South Carolina, Charleston, SC, United States
- Lewis J Frey
  - Medical University of South Carolina, Charleston, SC, United States
- Tracy Stecker
  - Medical University of South Carolina, Charleston, SC, United States
|