Reference Citation Analysis

For: Jagannatha AN, Yu H. Bidirectional RNN for Medical Event Detection in Electronic Health Records. Proc Conf 2016;2016:473-82. [PMID: 27885364 DOI: 10.18653/v1/n16-1056] [Citation(s) in RCA: 99] [Impact Index Per Article: 12.4] [Indexed: 11/25/2022]

Cited by Other Article(s)

Meier TA, Refahi MS, Hearne G, Restifo DS, Munoz-Acuna R, Rosen GL, Woloszynek S. The Role and Applications of Artificial Intelligence in the Treatment of Chronic Pain. Curr Pain Headache Rep 2024;28:769-784. [PMID: 38822995 DOI: 10.1007/s11916-024-01264-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/28/2024] [Indexed: 06/03/2024]

Yang J, Hu Z, Zhang L, Peng B. Predicting Drugs Suspected of Causing Adverse Drug Reactions Using Graph Features and Attention Mechanisms. Pharmaceuticals (Basel) 2024;17:822. [PMID: 39065673 PMCID: PMC11279999 DOI: 10.3390/ph17070822] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2024] [Revised: 06/12/2024] [Accepted: 06/20/2024] [Indexed: 07/28/2024]

Deimazar G, Sheikhtaheri A. Machine learning models to detect and predict patient safety events using electronic health records: A systematic review. Int J Med Inform 2023;180:105246. [PMID: 37837710 DOI: 10.1016/j.ijmedinf.2023.105246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2023] [Revised: 10/02/2023] [Accepted: 10/08/2023] [Indexed: 10/16/2023]

Abstract

INTRODUCTION

Identifying patient safety events using electronic health records (EHRs) and automated machine learning-based detection methods can help improve the efficiency and quality of healthcare service provision.

OBJECTIVE

This study aimed to systematically review machine learning-based methods and techniques, as well as their results for patient safety event management using EHRs.

METHODS

We reviewed the studies that focused on machine learning techniques, including automatic prediction and detection of patient safety events and medical errors through EHR analysis to manage patient safety events. The data were collected by searching Scopus, PubMed (Medline), Web of Science, EMBASE, and IEEE Xplore databases.

RESULTS

After screening, 41 papers were reviewed. Support vector machine (SVM), random forest, conditional random field (CRF), and bidirectional long short-term memory with conditional random field (BiLSTM-CRF) algorithms were mostly applied to predict, identify, and classify patient safety events using EHRs; however, they had different performances. BiLSTM-CRF was employed in most of the studies to extract and identify concepts, e.g., adverse drug events (ADEs) and adverse drug reactions (ADRs), as well as relationships between drug and severity, drug and ADEs, and drug and ADRs. Recurrent neural networks (RNN) and BiLSTM-CRF had the best results in detecting ADEs compared to other patient safety events. Linear classifiers and Naive Bayes (NB) had the highest performance for ADR detection. Logistic regression had the best results in detecting surgical site infections. According to the findings, the quality of the articles has improved in recent years, although not significantly, and their average quality scores remained low.

CONCLUSIONS

Machine learning can be useful in automatic detection and prediction of patient safety events. However, most of these algorithms have not yet been externally validated or prospectively tested. Therefore, further studies are required to improve the performance of these automated systems.

Ahmad PN, Liu Y, Khan K, Jiang T, Burhan U. BIR: Biomedical Information Retrieval System for Cancer Treatment in Electronic Health Record Using Transformers. SENSORS (BASEL, SWITZERLAND) 2023;23:9355. [PMID: 38067736 PMCID: PMC10708614 DOI: 10.3390/s23239355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Revised: 10/25/2023] [Accepted: 10/29/2023] [Indexed: 12/18/2023]

Adverse drug event detection using natural language processing: A scoping review of supervised learning methods. PLoS One 2023;18:e0279842. [PMID: 36595517 PMCID: PMC9810201 DOI: 10.1371/journal.pone.0279842] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 03/22/2022] [Accepted: 12/15/2022] [Indexed: 01/04/2023]

iADRGSE: A Graph-Embedding and Self-Attention Encoding for Identifying Adverse Drug Reaction in the Earlier Phase of Drug Development. Int J Mol Sci 2022;23:16216. [PMID: 36555858 PMCID: PMC9786008 DOI: 10.3390/ijms232416216] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/22/2022] [Revised: 12/15/2022] [Accepted: 12/16/2022] [Indexed: 12/23/2022]

Han F, Zhang Z, Zhang H, Nakaya J, Kudo K, Ogasawara K. Extraction and Quantification of Words Representing Degrees of Diseases: Combining the Fuzzy C-Means Method and Gaussian Membership. JMIR Form Res 2022;6:e38677. [DOI: 10.2196/38677] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2022] [Revised: 09/29/2022] [Accepted: 10/24/2022] [Indexed: 11/19/2022]

Abstract

Background

Due to the development of medical data, a large amount of clinical data has been generated. These unstructured data contain substantial information. Extracting useful knowledge from this data and making scientific decisions for diagnosing and treating diseases have become increasingly necessary. Unstructured data, such as in the Medical Information Mart for Intensive Care III (MIMIC-III) data set, contain several ambiguous words that demonstrate the subjectivity of doctors, such as descriptions of patient symptoms. These data could be used to further improve the accuracy of medical diagnostic system assessments. To the best of our knowledge, there is currently no method for extracting subjective words that express the extent of these symptoms (hereinafter, “degree words”).

Objective

Therefore, we propose using the fuzzy c-means (FCM) method and Gaussian membership to quantify the degree words in the clinical medical data set MIMIC-III.

Methods

First, we preprocessed the 381,091 radiology reports collected in MIMIC-III, and then we used the FCM method to extract degree words from unstructured text. Thereafter, we used the Gaussian membership method to quantify the extracted degree words, which transforms the fuzzy words extracted from the medical text into computer-recognizable numbers.

Results

The results showed that the digitization of ambiguous words in medical texts is feasible. The words representing each degree of each disease had a range of corresponding values. Examples of membership medians were 2.971 (atelectasis), 3.121 (pneumonia), 2.899 (pneumothorax), 3.051 (pulmonary edema), and 2.435 (pulmonary embolus). Additionally, all extracted words contained the same subjective words (low, high, etc), which allows for an objective evaluation method. Furthermore, we will verify the specific impact of the quantification results of ambiguous words such as symptom words and degree words on the use of medical texts in subsequent studies. These same ambiguous words may be used as a new set of feature values to represent the disorders.

Conclusions

This study proposes an innovative method for handling subjective words. We used the FCM method to extract the subjective degree words in the English-interpreted report of the MIMIC-III and then used the Gaussian functions to quantify the subjective degree words. In this method, words containing subjectivity in unstructured texts can be automatically processed and transformed into numerical ranges by digital processing. It was concluded that the digitization of ambiguous words in medical texts is feasible.
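The abstract above names two building blocks, fuzzy c-means (FCM) membership and a Gaussian membership function, without giving formulas. The following is a minimal sketch of the textbook definitions only, not the authors' pipeline. The function names, the one-dimensional input, and the example values are assumptions made for illustration.

```python
import math


def gaussian_membership(x: float, center: float, sigma: float) -> float:
    """Degree (between 0 and 1) to which x belongs to a fuzzy set centred at `center`."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def fcm_memberships(x: float, centers: list, m: float = 2.0) -> list:
    """Textbook fuzzy c-means membership of a scalar x in each cluster.

    `m` is the fuzzifier (m > 1); the returned memberships sum to 1.
    """
    dists = [abs(x - c) for c in centers]
    zeros = sum(1 for d in dists if d == 0.0)
    if zeros:
        # A point lying exactly on one or more centres is shared among them.
        return [1.0 / zeros if d == 0.0 else 0.0 for d in dists]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
        for d_i in dists
    ]


# Hypothetical cluster centres on an arbitrary 1-to-5 "degree" scale.
example = fcm_memberships(2.5, [1.0, 3.0, 5.0])
```

In this sketch a value near a centre receives most of its membership from that cluster, while the Gaussian function maps distance from a centre to a smooth score that falls off with `sigma`.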

Salas M, Petracek J, Yalamanchili P, Aimer O, Kasthuril D, Dhingra S, Junaid T, Bostic T. The Use of Artificial Intelligence in Pharmacovigilance: A Systematic Review of the Literature. Pharmaceut Med 2022;36:295-306. [PMID: 35904529 DOI: 10.1007/s40290-022-00441-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/06/2022] [Indexed: 11/25/2022]

Abstract

INTRODUCTION

Artificial intelligence through machine learning uses algorithms and prior learnings to make predictions. Recently, there has been interest in making greater use of artificial intelligence in the pharmacovigilance of products already on the market and of pharmaceuticals in development.

OBJECTIVE

The aim of this study was to identify and describe the uses of artificial intelligence in pharmacovigilance through a systematic literature review.

METHODS

Embase and MEDLINE database searches were conducted for articles published from January 1, 2015 to July 9, 2021 using search terms such as 'pharmacovigilance,' 'patient safety,' 'artificial intelligence,' and 'machine learning' in the title or abstract. Scientific articles that contained information on the use of artificial intelligence in all modalities of patient safety or pharmacovigilance were reviewed and synthesized using a pre-specified data extraction template. Articles with incomplete information and letters to editor, notes, and commentaries were excluded.

RESULTS

Sixty-six articles were identified for evaluation. Most relevant articles on artificial intelligence focused on machine learning, and it was used in patient safety in the identification of adverse drug events (ADEs) and adverse drug reactions (ADRs) (57.6%), processing safety reports (21.2%), extraction of drug-drug interactions (7.6%), identification of populations at high risk for drug toxicity or guidance for personalized care (7.6%), prediction of side effects (3.0%), simulation of clinical trials (1.5%), and integration of prediction uncertainties into diagnostic classifiers to increase patient safety (1.5%). Artificial intelligence has been used to identify safety signals through automated processes and training with machine learning models; however, the findings may not be generalizable given that there were different types of data included in each source.

CONCLUSION

Artificial intelligence allows for the processing and analysis of large amounts of data and can be applied to various disease states. The automation and machine learning models can optimize pharmacovigilance processes and provide a more efficient way to analyze information relevant to safety, although more research is needed to identify if this optimization has an impact on the quality of safety analyses. It is expected that its use will increase in the near future, particularly with its role in the prediction of side effects and ADRs.

A Novel Encoder-Decoder Model for Multivariate Time Series Forecasting. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022;2022:5596676. [PMID: 35463259 PMCID: PMC9023224 DOI: 10.1155/2022/5596676] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2022] [Revised: 03/26/2022] [Accepted: 03/28/2022] [Indexed: 11/29/2022]

Richter-Pechanski P, Geis NA, Kiriakou C, Schwab DM, Dieterich C. Automatic extraction of 12 cardiovascular concepts from German discharge letters using pre-trained language models. Digit Health 2021;7:20552076211057662. [PMID: 34868618 PMCID: PMC8637713 DOI: 10.1177/20552076211057662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2021] [Accepted: 10/15/2021] [Indexed: 11/17/2022]

A spatiotemporal multi-feature extraction framework for opinion mining. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.11.098] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]

Zhang H, Zhang J, Ni W, Jiang Y, Liu K, Sun D, Li J. Transformer + GAN based Traditional Chinese Medicine inpatient prescription recommendation (Preprint). JMIR Med Inform 2021;10:e35239. [PMID: 35639469 PMCID: PMC9198826 DOI: 10.2196/35239] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/27/2021] [Revised: 03/05/2022] [Accepted: 04/11/2022] [Indexed: 11/30/2022]

Abstract

Background

Traditional Chinese medicine (TCM) practitioners usually follow a 4-step evaluation process during patient diagnosis: observation, auscultation and olfaction, inquiry, and pulse feeling and palpation. The information gathered in this process, along with laboratory test results and other measurements such as vital signs, is recorded in the patient’s electronic health record (EHR). In fact, all the information needed to make a treatment plan is contained in the EHR; however, only a seasoned TCM physician could use this information well to make a good treatment plan as the reasoning process is very complicated, and it takes years of practice for a medical graduate to master the reasoning skill. In this digital medicine era, with a deluge of medical data, ever-increasing computing power, and more advanced artificial neural network models, it is not only desirable but also readily possible for a computerized system to mimic the decision-making process of a TCM physician.

Objective

This study aims to develop an assistive tool that can predict prescriptions for inpatients in a hospital based on patients’ clinical EHRs.

Methods

Clinical health records containing medical histories, as well as current symptoms and diagnosis information, were used to train a transformer-based neural network model using the corresponding physician’s prescriptions as the target. This was accomplished by extracting relevant information, such as the patient’s current illness, medicines taken, nursing care given, vital signs, examinations, and laboratory results from the patient’s EHRs. The obtained information was then sorted chronologically to produce a sequence of data for the patient. These time sequence data were then used as input to a modified transformer network, which was chosen as a prescription prediction model. The output of the model was the prescription for the patient. The ultimate goal is for this tool to generate a prescription that matches what an expert TCM physician would prescribe. To alleviate the issue of overfitting, a generative adversarial network was used to augment the training sample data set by generating noise-added samples from the original training samples.

Results

In total, 21,295 copies of inpatient electronic medical records from Guang’anmen Hospital were used in this study. These records were generated between January 2017 and December 2018, covering 6352 types of medicines. These medicines were sorted into 819 types of first-category medicines based on their class relationships. As shown by the test results, the performance of a fully trained transformer model can have an average precision rate of 80.58% and an average recall rate of 68.49%.

Conclusions

As shown by the preliminary test results, the transformer-based TCM prescription recommendation model outperformed the existing conventional methods. The extra training samples generated by the generative adversarial network help to overcome the overfitting issue, leading to further improved recall and precision rates.
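The abstract reports an average precision rate and an average recall rate for predicted prescriptions but does not say how they were computed. A minimal sketch follows, assuming each prescription is treated as a set of medicine labels and the per-patient scores are averaged. The function names and the example labels are hypothetical.

```python
def precision_recall(predicted: set, actual: set) -> tuple:
    """Set-based precision and recall for one predicted prescription."""
    hits = len(predicted & actual)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    return precision, recall


def macro_average(pairs: list) -> tuple:
    """Average the per-patient (precision, recall) pairs."""
    n = len(pairs)
    return (sum(p for p, _ in pairs) / n, sum(r for _, r in pairs) / n)


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2.0 * precision * recall / total if total else 0.0


# Hypothetical example: two predicted medicines out of three are correct,
# and two of the four prescribed medicines were recovered.
p, r = precision_recall({"herb_a", "herb_b", "herb_c"},
                        {"herb_b", "herb_c", "herb_d", "herb_e"})
```

Plugging the reported averages into `f1(0.8058, 0.6849)` gives roughly 0.74. This is only indicative, because the F1 of averaged precision and recall need not equal the average of per-patient F1 scores.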

Lee K, Kayaalp M, Henry S, Uzuner Ö. A Context-Enhanced De-identification System. ACM TRANSACTIONS ON COMPUTING FOR HEALTHCARE 2021;3. [PMID: 34676376 DOI: 10.1145/3470980] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]

Prabhakar SK, Won DO. Medical Text Classification Using Hybrid Deep Learning Models with Multihead Attention. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2021;2021:9425655. [PMID: 34603437 PMCID: PMC8486521 DOI: 10.1155/2021/9425655] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/11/2021] [Accepted: 08/31/2021] [Indexed: 11/18/2022]

Estiri H, Strasser ZH, Murphy SN. High-throughput phenotyping with temporal sequences. J Am Med Inform Assoc 2021;28:772-781. [PMID: 33313899 DOI: 10.1093/jamia/ocaa288] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 07/01/2020] [Accepted: 11/04/2020] [Indexed: 12/15/2022]

Abstract

OBJECTIVE

High-throughput electronic phenotyping algorithms can accelerate translational research using data from electronic health record (EHR) systems. The temporal information buried in EHRs is often underutilized in developing computational phenotypic definitions. This study aims to develop a high-throughput phenotyping method, leveraging temporal sequential patterns from EHRs.

MATERIALS AND METHODS

We develop a representation mining algorithm to extract 5 classes of representations from EHR diagnosis and medication records: the aggregated vector of the records (aggregated vector representation), the standard sequential patterns (sequential pattern mining), the transitive sequential patterns (transitive sequential pattern mining), and 2 hybrid classes. Using EHR data on 10 phenotypes from the Mass General Brigham Biobank, we train and validate phenotyping algorithms.

RESULTS

Phenotyping with temporal sequences resulted in a superior classification performance across all 10 phenotypes compared with the standard representations in electronic phenotyping. The high-throughput algorithm's classification performance was superior or similar to the performance of previously published electronic phenotyping algorithms. We characterize and evaluate the top transitive sequences of diagnosis records paired with the records of risk factors, symptoms, complications, medications, or vaccinations.

DISCUSSION

The proposed high-throughput phenotyping approach enables seamless discovery of sequential record combinations that may be difficult to assume from raw EHR data. Transitive sequences offer more accurate characterization of the phenotype, compared with its individual components, and reflect the actual lived experiences of the patients with that particular disease.

CONCLUSION

Sequential data representations provide a precise mechanism for incorporating raw EHR records into downstream machine learning. Our approach starts with user interpretability and works backward to the technology.
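The abstract distinguishes standard from transitive sequential patterns without defining them. The sketch below assumes that a transitive sequence is any ordered pair of distinct records in which the first occurs strictly before the second, whether or not they are adjacent in the timeline. It is an illustration under that assumption, not the authors' algorithm, and the function names and record codes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def transitive_pairs(records: list) -> set:
    """Return all (earlier_code, later_code) pairs from (time, code) records."""
    ordered = sorted(records, key=lambda rec: rec[0])
    pairs = set()
    for (t1, first), (t2, second) in combinations(ordered, 2):
        # Require a strict time order and two different codes.
        if t1 < t2 and first != second:
            pairs.add((first, second))
    return pairs


def pair_support(patients: dict) -> Counter:
    """Count the number of patients in whom each ordered pair occurs."""
    support = Counter()
    for records in patients.values():
        support.update(transitive_pairs(records))
    return support


# Hypothetical timelines: (day offset, record code).
patients = {
    "p1": [(1, "dx_A"), (2, "rx_B"), (3, "dx_C")],
    "p2": [(1, "dx_A"), (5, "dx_C")],
}
support = pair_support(patients)
```

Pairs that occur in many patients could then serve as candidate features for a downstream classifier, which is one way to read the "sequential data representations" the conclusion refers to.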

Automatic Prediction of Recurrence of Major Cardiovascular Events: A Text Mining Study Using Chest X-Ray Reports. JOURNAL OF HEALTHCARE ENGINEERING 2021;2021:6663884. [PMID: 34306597 PMCID: PMC8285182 DOI: 10.1155/2021/6663884] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/14/2020] [Revised: 05/29/2021] [Accepted: 06/29/2021] [Indexed: 11/17/2022]

Abstract

Methods

We used EHR data of patients included in the Second Manifestations of ARTerial disease (SMART) study. We propose a deep learning-based multimodal architecture for our text mining pipeline that integrates neural text representation with preprocessed clinical predictors for the prediction of recurrence of major cardiovascular events in cardiovascular patients. Text preprocessing, including cleaning and stemming, was first applied to filter out the unwanted texts from X-ray radiology reports. Thereafter, text representation methods were used to numerically represent unstructured radiology reports with vectors. Subsequently, these text representation methods were added to prediction models to assess their clinical relevance. In this step, we applied logistic regression, support vector machine (SVM), multilayer perceptron neural network, convolutional neural network, long short-term memory (LSTM), and bidirectional LSTM deep neural network (BiLSTM).

Results

We performed various experiments to evaluate the added value of the text in the prediction of major cardiovascular events. The two main scenarios were the integration of radiology reports (1) with classical clinical predictors and (2) with only age and sex in the case of unavailable clinical predictors. In total, data of 5603 patients were used with 5-fold cross-validation to train the models. In the first scenario, the multimodal BiLSTM (MI-BiLSTM) model achieved an area under the curve (AUC) of 84.7%, misclassification rate of 14.3%, and F1 score of 83.8%. In this scenario, the SVM model, trained on clinical variables and bag-of-words representation, achieved the lowest misclassification rate of 12.2%. In the case of unavailable clinical predictors, the MI-BiLSTM model trained on radiology reports and demographic (age and sex) variables reached an AUC, F1 score, and misclassification rate of 74.5%, 70.8%, and 20.4%, respectively.

Conclusions

Using the case study of routine care chest X-ray radiology reports, we demonstrated the clinical relevance of integrating text features and classical predictors in our text mining pipeline for cardiovascular risk prediction. The MI-BiLSTM model with word embedding representation appeared to have a desirable performance when trained on text data integrated with the clinical variables from the SMART study. Our results mined from chest X-ray reports showed that models using text data in addition to laboratory values outperform those using only known clinical predictors.

A hybrid medical text classification framework: Integrating attentive rule construction and neural network. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.02.069] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/20/2022]

Improved machine learning performances with transfer learning to predicting need for hospitalization in arboviral infections against the small dataset. Neural Comput Appl 2021;33:14975-14989. [PMID: 34092929 PMCID: PMC8169423 DOI: 10.1007/s00521-021-06133-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/05/2020] [Accepted: 05/15/2021] [Indexed: 12/11/2022]

Lee H, Kang J, Yeo J. Medical Specialty Recommendations by an Artificial Intelligence Chatbot on a Smartphone: Development and Deployment. J Med Internet Res 2021;23:e27460. [PMID: 33882012 PMCID: PMC8104000 DOI: 10.2196/27460] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/26/2021] [Revised: 03/03/2021] [Accepted: 04/17/2021] [Indexed: 01/22/2023]

Abstract

Background

The COVID-19 pandemic has limited daily activities and even contact between patients and primary care providers. This makes it more difficult to provide adequate primary care services, which include connecting patients to an appropriate medical specialist. A smartphone-compatible artificial intelligence (AI) chatbot that classifies patients’ symptoms and recommends the appropriate medical specialty could provide a valuable solution.

Objective

In order to establish a contactless method of recommending the appropriate medical specialty, this study aimed to construct a deep learning–based natural language processing (NLP) pipeline and to develop an AI chatbot that can be used on a smartphone.

Methods

We collected 118,008 sentences containing information on symptoms with labels (medical specialty), conducted data cleansing, and finally constructed a pipeline of 51,134 sentences for this study. Several deep learning models, including 4 different long short-term memory (LSTM) models with or without attention and with or without a pretrained FastText embedding layer, as well as bidirectional encoder representations from transformers for NLP, were trained and validated using a randomly selected test data set. The performance of the models was evaluated on the basis of the precision, recall, F₁-score, and area under the receiver operating characteristic curve (AUC). An AI chatbot was also designed to make it easy for patients to use this specialty recommendation system. We used an open-source framework called “Alpha” to develop our AI chatbot. This takes the form of a web-based app with a frontend chat interface capable of conversing in text and a backend cloud-based server application to handle data collection, process the data with a deep learning model, and offer the medical specialty recommendation in a responsive web that is compatible with both desktops and smartphones.

Results

The bidirectional encoder representations from transformers model yielded the best performance, with an AUC of 0.964 and F₁-score of 0.768, followed by the LSTM model with embedding vectors, with an AUC of 0.965 and F₁-score of 0.739. Considering the limitations of computing resources and the wide availability of smartphones, the LSTM model with embedding vectors trained on our data set was adopted for our AI chatbot service. We also deployed an Alpha version of the AI chatbot to be executed on both desktops and smartphones.

Conclusions

With the increasing need for telemedicine during the current COVID-19 pandemic, an AI chatbot with a deep learning–based NLP model that can recommend a medical specialty to patients through their smartphones would be exceedingly useful. This chatbot allows patients to identify the proper medical specialist in a rapid and contactless manner, based on their symptoms, thus potentially supporting both patients and primary care providers.

Predicting mortality and hospitalization in heart failure using machine learning: A systematic literature review. IJC HEART & VASCULATURE 2021;34:100773. [PMID: 33912652 PMCID: PMC8065274 DOI: 10.1016/j.ijcha.2021.100773] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 01/10/2021] [Revised: 03/11/2021] [Accepted: 03/23/2021] [Indexed: 12/13/2022]

Ruan X, Li Y, Jin X, Deng P, Xu J, Li N, Li X, Liu Y, Hu Y, Xie J, Wu Y, Long D, He W, Yuan D, Guo Y, Li H, Huang H, Yang S, Han M, Zhuang B, Qian J, Cao Z, Zhang X, Xiao J, Xu L. Health-adjusted life expectancy (HALE) in Chongqing, China, 2017: An artificial intelligence and big data method estimating the burden of disease at city level. THE LANCET REGIONAL HEALTH. WESTERN PACIFIC 2021;9:100110. [PMID: 34379708 PMCID: PMC8315391 DOI: 10.1016/j.lanwpc.2021.100110] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/18/2020] [Revised: 01/25/2021] [Accepted: 02/03/2021] [Indexed: 01/08/2023]

Abstract

BACKGROUND

A universally applicable approach that provides standard HALE measurements for different regions has yet to be developed because of the difficulties of health information collection. In this study, we developed a natural language processing (NLP) based HALE estimation approach by using individual-level electronic medical records (EMRs), which made it possible to calculate HALE in a timely manner at different temporal or spatial granularities.

METHODS

We performed diagnostic concept extraction and normalisation on 13.99 million EMRs with NLP to estimate the prevalence of 254 diseases in the WHO Global Burden of Disease Study (GBD). Then, we calculated HALE in Chongqing, 2017, by using the life table technique and Sullivan's method, and analysed the contribution of diseases to the expected years "lost" due to disability (DLE).

FINDINGS

Our method identified a life expectancy at birth (LE0) of 77.9 years and health-adjusted life expectancy at birth (HALE0) of 71.7 years for the general Chongqing population of 2017. In particular, the male LE0 and HALE0 were 76.3 years and 68.9 years, respectively, while the female LE0 and HALE0 were 80.0 years and 74.4 years, respectively. Cerebrovascular diseases, cancers, and injuries were the top three deterioration factors, which reduced HALE by 2.67, 2.15, and 1.19 years, respectively.

INTERPRETATION

The results demonstrated the feasibility and effectiveness of EMRs-based HALE estimation. Moreover, the method allowed for a potentially transferable framework that facilitated a more convenient comparison of cross-sectional and longitudinal studies on HALE between regions. In summary, this study provided insightful solutions to the global ageing and health problems that the world is facing.

FUNDING

National Key R&D Program of China (2018YFC2000400).
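The Methods paragraph of this abstract names the life table technique and Sullivan's method without spelling them out. In its usual textbook form, Sullivan's method weights the person-years lived in each age interval by the proportion of that time spent in good health. The sketch below follows that form. The function names, the two-interval life table, and the disability values are hypothetical and are not taken from the study.

```python
def life_expectancy(survivors_at_start: float, person_years: list) -> float:
    """Life expectancy: total person-years lived divided by survivors at the start age."""
    return sum(person_years) / survivors_at_start


def sullivan_hale(survivors_at_start: float, person_years: list,
                  disability: list) -> float:
    """Sullivan's method: discount each interval's person-years by its
    disability-weighted prevalence, then divide by survivors at the start age."""
    if len(person_years) != len(disability):
        raise ValueError("one disability value is needed per age interval")
    healthy_years = sum(
        years * (1.0 - weight) for years, weight in zip(person_years, disability)
    )
    return healthy_years / survivors_at_start


# Hypothetical two-interval life table for a cohort of 100,000 births.
cohort = 100_000
person_years = [5_000_000, 2_000_000]   # person-years lived in each interval
disability = [0.10, 0.30]               # disability-weighted prevalence per interval

le0 = life_expectancy(cohort, person_years)
hale0 = sullivan_hale(cohort, person_years, disability)
years_lost = le0 - hale0                # expected years "lost" to disability
```

Read the same way, the figures reported in the abstract (LE0 of 77.9 years and HALE0 of 71.7 years) imply about 6.2 expected years lost to disability for the general population.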

Affiliation(s)

Xiaowen Ruan Ping An Technology (Shenzhen) Co., Ltd., Ping'an International Financial Center, Futian District, Shenzhen 518001, China
Yue Li China Population and Development Research Center, 12 Dahuisi Road, Haidian District, Beijing 100801, China
Xiaohui Jin Ping An Technology (Shenzhen) Co., Ltd., No. 316, Laoshan Road, Pudong New District, Shanghai 200122, China
Pan Deng Ping An Technology (Shenzhen) Co., Ltd., Ping'an International Financial Center, Futian District, Shenzhen 518001, China
Jiaying Xu Ping An Technology (Shenzhen) Co., Ltd., Ping'an International Financial Center, Futian District, Shenzhen 518001, China
Na Li Ping An Technology (Shenzhen) Co., Ltd., Ping An International Finance Centre, No. 3, South Xinyuan Road, Chaoyang District, Beijing 100011, China
Xian Li Ping An Technology (Shenzhen) Co., Ltd., Ping'an International Financial Center, Futian District, Shenzhen 518001, China
Yuqi Liu Ping An Technology (Shenzhen) Co., Ltd., Ping An International Finance Centre, No. 3, South Xinyuan Road, Chaoyang District, Beijing 100011, China
Yiyi Hu Ping An Technology (Shenzhen) Co., Ltd., No. 316, Laoshan Road, Pudong New District, Shanghai 200122, China
Jingwen Xie Ping An Technology (Shenzhen) Co., Ltd., No. 316, Laoshan Road, Pudong New District, Shanghai 200122, China
Yingnan Wu Ping An Technology (Shenzhen) Co., Ltd., Ping An International Finance Centre, No. 3, South Xinyuan Road, Chaoyang District, Beijing 100011, China
Dongyan Long Ping An Technology (Shenzhen) Co., Ltd., Ping'an International Financial Center, Futian District, Shenzhen 518001, China
Wen He Ping An Technology (Shenzhen) Co., Ltd., Ping An International Finance Centre, No. 3, South Xinyuan Road, Chaoyang District, Beijing 100011, China
Dongsheng Yuan Ping An Technology (Shenzhen) Co., Ltd., No. 316, Laoshan Road, Pudong New District, Shanghai 200122, China
Yifei Guo Ping An Technology (Shenzhen) Co., Ltd., No. 316, Laoshan Road, Pudong New District, Shanghai 200122, China
Heng Li Ping An Technology (Shenzhen) Co., Ltd., Ping'an International Financial Center, Futian District, Shenzhen 518001, China
He Huang Chongqing Municipal Health Commission, No. 232 Renmin Road, Yuzhong District, Chongqing 400015, China
Shan Yang Chongqing Municipal Health Commission, No. 232 Renmin Road, Yuzhong District, Chongqing 400015, China
Mei Han Ping An Technology (Shenzhen) Co., Ltd., Ping An Tech, US Research Lab, Suite 150, 3000 El Camino Real, Palo Alto, CA 94306, United States
Bojin Zhuang Ping An Technology (Shenzhen) Co., Ltd., Ping'an International Financial Center, Futian District, Shenzhen 518001, China
Jiang Qian Ping An Technology (Shenzhen) Co., Ltd., Ping'an International Financial Center, Futian District, Shenzhen 518001, China
Zhenjie Cao Ping An Technology (Shenzhen) Co., Ltd., Ping An Tech, US Research Lab, Suite 150, 3000 El Camino Real, Palo Alto, CA 94306, United States
Xuying Zhang China Population and Development Research Center, 12 Dahuisi Road, Haidian District, Beijing 100801, China
Jing Xiao Ping An Technology (Shenzhen) Co., Ltd., Ping'an International Financial Center, Futian District, Shenzhen 518001, China
Liang Xu Ping An Technology (Shenzhen) Co., Ltd., Ping'an International Financial Center, Futian District, Shenzhen 518001, China

Chen L, Gu Y, Ji X, Sun Z, Li H, Gao Y, Huang Y. Extracting medications and associated adverse drug events using a natural language processing system combining knowledge base and deep learning. J Am Med Inform Assoc 2021;27:56-64. [PMID: 31591641 DOI: 10.1093/jamia/ocz141] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2019] [Revised: 01/25/2019] [Accepted: 07/22/2019] [Indexed: 11/13/2022] Open

Yang X, Bian J, Fang R, Bjarnadottir RI, Hogan WR, Wu Y. Identifying relations of medications with adverse drug events using recurrent convolutional neural networks and gradient boosting. J Am Med Inform Assoc 2021;27:65-72. [PMID: 31504605 DOI: 10.1093/jamia/ocz144] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/30/2019] [Revised: 05/30/2019] [Accepted: 07/22/2019] [Indexed: 01/19/2023] Open

Lee CY, Chen YP. Descriptive prediction of drug side‐effects using a hybrid deep learning model. INT J INTELL SYST 2021. [DOI: 10.1002/int.22389] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]

Shi X, Yi Y, Xiong Y, Tang B, Chen Q, Wang X, Ji Z, Zhang Y, Xu H. Extracting entities with attributes in clinical text via joint deep learning. J Am Med Inform Assoc 2021;26:1584-1591. [PMID: 31550346 DOI: 10.1093/jamia/ocz158] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2019] [Revised: 07/18/2019] [Accepted: 08/15/2019] [Indexed: 11/13/2022] Open

Ibrahim MA, Ghani Khan MU, Mehmood F, Asim MN, Mahmood W. GHS-NET a generic hybridized shallow neural network for multi-label biomedical text classification. J Biomed Inform 2021;116:103699. [PMID: 33601013 DOI: 10.1016/j.jbi.2021.103699] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2020] [Revised: 11/30/2020] [Accepted: 02/02/2021] [Indexed: 01/16/2023]

Abstract

Exponential growth of biomedical literature and clinical data demands more robust yet precise computational methodologies to extract useful insights from biomedical literature and to perform accurate assignment of disease-specific codes. Such approaches can largely enhance the effectiveness of diverse biomedicine and bioinformatics applications. State-of-the-art computational biomedical text classification methodologies leverage either discriminative features extracted through convolution operations performed by a deep convolutional neural network or contextual information extracted by a recurrent neural network. However, none of these methodologies takes advantage of both convolutional and recurrent neural networks. Further, existing methodologies fail to perform well when classifying biomedical text of different genres, such as biomedical literature or clinical notes. We present, for the first time, a generic deep learning based hybrid multi-label classification methodology, GHS-NET, which can be used to accurately classify biomedical text of diverse genres. GHS-NET uses a convolutional neural network to extract the most discriminative features and a bi-directional Long Short-Term Memory network to acquire contextual information. The effectiveness of GHS-NET is evaluated on extreme multi-label biomedical literature classification and on assignment of ICD-9 codes to clinical notes. For extreme multi-label biomedical literature classification, a comparison with the state-of-the-art deep learning based methodology shows that GHS-NET improves precision, recall, and F1-score by 1%, 6%, and 1% on the hallmarks of cancer dataset and by 10%, 16%, and 11% on the chemical exposure dataset. For clinical notes classification, GHS-NET outperforms the previous best deep learning based methodology on the Medical Information Mart for Intensive Care dataset (MIMIC-III) by a significant margin of 6% and 8% in terms of recall and F1-score. GHS-NET is available as a web service at¹ and can potentially be used to accurately classify multi-variate disease and chemical exposure specific text.

Oleynik M, Kugic A, Kasáč Z, Kreuzthaler M. Evaluating shallow and deep learning strategies for the 2018 n2c2 shared task on clinical text classification. J Am Med Inform Assoc 2021;26:1247-1254. [PMID: 31512729 PMCID: PMC6798565 DOI: 10.1093/jamia/ocz149] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2019] [Revised: 06/29/2019] [Accepted: 07/31/2019] [Indexed: 12/17/2022] Open

Abstract

Objective

Automated clinical phenotyping is challenging because word-based features quickly turn it into a high-dimensional problem, in which the small, privacy-restricted, training datasets might lead to overfitting. Pretrained embeddings might solve this issue by reusing input representation schemes trained on a larger dataset. We sought to evaluate shallow and deep learning text classifiers and the impact of pretrained embeddings in a small clinical dataset.

Materials and Methods

We participated in the 2018 National NLP Clinical Challenges (n2c2) Shared Task on cohort selection and received an annotated dataset with medical narratives of 202 patients for multilabel binary text classification. We set our baseline to a majority classifier, to which we compared a rule-based classifier and orthogonal machine learning strategies: support vector machines, logistic regression, and long short-term memory neural networks. We evaluated logistic regression and long short-term memory using both self-trained and pretrained BioWordVec word embeddings as input representation schemes.

Results

The rule-based classifier showed the highest overall micro F₁ score (0.9100), with which we finished first in the challenge. Shallow machine learning strategies showed lower overall micro F₁ scores, but still higher than deep learning strategies and the baseline. We could not show a difference in classification efficiency between self-trained and pretrained embeddings.
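
As a rough illustration of the micro F₁ metric reported above, the sketch below pools true positives, false positives, and false negatives over every (patient, criterion) pair before computing F₁. It uses one common definition of micro-averaging; the shared task's official scorer may differ in detail, and the function name and toy labels are hypothetical, not taken from the study.

```python
from typing import Sequence


def micro_f1(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Micro-averaged F1: pool TP/FP/FN over every (patient, label) pair."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    # By convention here, return 0.0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


# Toy data (hypothetical): 3 patients, 3 binary criteria each.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1], [1, 1, 0]]
score = micro_f1(y_true, y_pred)
print(f"micro F1 = {score:.4f}")
```

Because the counts are pooled before averaging, frequent criteria dominate the score, which is worth keeping in mind when reading micro F₁ on a small, imbalanced dataset.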

Discussion

Clinical context, negation, and value-based criteria hindered shallow machine learning approaches, while deep learning strategies could not capture the term diversity due to the small training dataset.

Conclusion

Shallow methods for clinical phenotyping can still outperform deep learning methods in small imbalanced data, even when supported by pretrained embeddings.

Mitra A, Rawat BPS, McManus D, Kapoor A, Yu H. Bleeding Entity Recognition in Electronic Health Records: A Comprehensive Analysis of End-to-End Systems. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2021;2020:860-869. [PMID: 33936461 PMCID: PMC8075442] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]

Rawat BPS, Jagannatha A, Liu F, Yu H. Inferring ADR causality by predicting the Naranjo Score from Clinical Notes. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2021;2020:1041-1049. [PMID: 33936480 PMCID: PMC8075501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]

Liu F, Zheng X, Yu H, Tjia J. Neural Multi-Task Learning for Adverse Drug Reaction Extraction. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2021;2020:756-762. [PMID: 33936450 PMCID: PMC8075418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]

Alakus TB, Turkoglu I. A Novel Protein Mapping Method for Predicting the Protein Interactions in COVID-19 Disease by Deep Learning. Interdiscip Sci 2021;13:44-60. [PMID: 33433784 PMCID: PMC7801232 DOI: 10.1007/s12539-020-00405-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2020] [Revised: 11/23/2020] [Accepted: 11/28/2020] [Indexed: 12/11/2022]

Abstract

The new type of coronavirus (SARS-CoV-2) that emerged in Wuhan, China has spread rapidly around the world and has become a pandemic. In addition to having a significant impact on daily life, it also affects other areas, including public health and the economy. Currently, there is no vaccine or antiviral drug available to prevent the COVID-19 disease. Therefore, determining the protein interactions of new types of coronavirus is vital in clinical studies, drug therapy, and the identification of preclinical compounds and protein functions. Protein–protein interactions are important for examining protein functions and pathways involved in various biological processes and for determining the cause and progression of diseases. Various high-throughput experimental methods have been used to identify protein–protein interactions in organisms, yet there is still a huge gap in specifying all possible protein interactions in an organism. In addition, since the experimental methods used include cloning, labeling, and affinity purification mass spectrometry, the processes take a long time. Determining these interactions with artificial intelligence-based methods rather than experimental approaches may help to identify protein functions faster. Thus, protein–protein interaction prediction using deep-learning algorithms has been employed in conjunction with experimental methods to explore new protein interactions. However, to predict protein interactions with artificial intelligence techniques, protein sequences need to be mapped. There are various types and numbers of protein-mapping methods in the literature. In this study, we wanted to contribute to the literature by proposing a novel protein-mapping method based on the AVL tree. The proposed method was inspired by the fast search performance of the AVL tree as a dictionary structure and was used to verify the protein interactions between the SARS-CoV-2 virus and human. First, protein sequences were mapped by both the proposed method and various protein-mapping methods. Then, the mapped protein sequences were normalized and classified by bidirectional recurrent neural networks. The performance of the proposed method was evaluated with accuracy, f1-score, precision, recall, and AUC scores. Our results indicated that our mapping method predicts the protein interactions between SARS-CoV-2 virus proteins and human proteins at an accuracy of 97.76%, precision of 97.60%, recall of 98.33%, f1-score of 79.42%, and an AUC of 89% on average.

Chen TL, Emerling M, Chaudhari GR, Chillakuru YR, Seo Y, Vu TH, Sohn JH. Domain specific word embeddings for natural language processing in radiology. J Biomed Inform 2021;113:103665. [PMID: 33333323 PMCID: PMC7856086 DOI: 10.1016/j.jbi.2020.103665] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/10/2020] [Revised: 11/03/2020] [Accepted: 12/10/2020] [Indexed: 11/25/2022]

Abstract

BACKGROUND

There has been increasing interest in machine learning based natural language processing (NLP) methods in radiology; however, models have often used word embeddings trained on general web corpora due to lack of a radiology-specific corpus.

PURPOSE

We examined the potential of Radiopaedia to serve as a general radiology corpus to produce radiology-specific word embeddings that could be used to enhance performance on an NLP task on radiological text.

MATERIALS AND METHODS

Embeddings of dimension 50, 100, 200, and 300 were trained on articles collected from Radiopaedia using a GloVe algorithm and evaluated on analogy completion. A shallow neural network using input from either our trained embeddings or pre-trained Wikipedia 2014 + Gigaword 5 (WG) embeddings was used to label the Radiopaedia articles. Labeling performance was evaluated based on exact match accuracy and Hamming loss. The McNemar's test with continuity and the Benjamini-Hochberg correction and a 5×2 cross validation paired two-tailed t-test were used to assess statistical significance.
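
The two labeling metrics named above can be sketched in a few lines. Exact match accuracy counts an article as correct only if every label matches, while Hamming loss is the fraction of individual label slots that are wrong. The function names and toy label matrices below are hypothetical and are not taken from the study.

```python
from typing import Sequence


def exact_match_accuracy(y_true: Sequence[Sequence[int]],
                         y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of samples whose full label vector is predicted exactly."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return matches / len(y_true)


def hamming_loss(y_true: Sequence[Sequence[int]],
                 y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of individual label slots that are predicted incorrectly."""
    wrong = sum(t != p
                for true_row, pred_row in zip(y_true, y_pred)
                for t, p in zip(true_row, pred_row))
    total = sum(len(true_row) for true_row in y_true)
    return wrong / total


# Toy data (hypothetical): 4 articles, 3 possible labels each.
y_true = [[1, 0, 0], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [1, 0, 1]]
accuracy = exact_match_accuracy(y_true, y_pred)
loss = hamming_loss(y_true, y_pred)
print(f"exact match accuracy = {accuracy:.3f}, Hamming loss = {loss:.3f}")
```

The two metrics pull in different directions: a single wrong label makes an article count as a complete miss for exact match accuracy, but it adds only one error out of many slots to the Hamming loss.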

RESULTS

For accuracy in the analogy task, 50-dimensional (50-D) Radiopaedia embeddings outperformed WG embeddings on tumor origin analogies (p < 0.05) and organ adjectives (p < 0.01) whereas WG embeddings tended to outperform on inflammation location and bone vs. muscle analogies (p < 0.01). The two embeddings had comparable performance on other subcategories. In the labeling task, the Radiopaedia-based model outperformed the WG based model at 50, 100, 200, and 300-D for exact match accuracy (p < 0.001, p < 0.001, p < 0.01, and p < 0.05, respectively) and Hamming loss (p < 0.001, p < 0.001, p < 0.01, and p < 0.05, respectively).

CONCLUSION

We have developed a set of word embeddings from Radiopaedia and shown that they can preserve relevant medical semantics and augment performance on a radiology NLP task. Our results suggest that the cultivation of a radiology-specific corpus can benefit radiology NLP models in the future.

Rashidian S, Abell-Hart K, Hajagos J, Moffitt R, Lingam V, Garcia V, Tsai CW, Wang F, Dong X, Sun S, Deng J, Gupta R, Miller J, Saltz J, Saltz M. Detecting Miscoded Diabetes Diagnosis Codes in Electronic Health Records for Quality Improvement: Temporal Deep Learning Approach. JMIR Med Inform 2020;8:e22649. [PMID: 33331828 PMCID: PMC7775195 DOI: 10.2196/22649] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2020] [Revised: 09/24/2020] [Accepted: 09/27/2020] [Indexed: 01/16/2023] Open

Affiliation(s)

Sina Rashidian Department of Computer Science, Stony Brook University, Stony Brook, NY, United States
Kayley Abell-Hart Department of Biomedical Informatics, Renaissance School of Medicine at Stony Brook, Stony Brook, NY, United States
Janos Hajagos Department of Biomedical Informatics, Renaissance School of Medicine at Stony Brook, Stony Brook, NY, United States
Richard Moffitt Department of Biomedical Informatics, Renaissance School of Medicine at Stony Brook, Stony Brook, NY, United States
Veena Lingam Department of Biomedical Informatics, Renaissance School of Medicine at Stony Brook, Stony Brook, NY, United States
Victor Garcia Department of Biomedical Informatics, Renaissance School of Medicine at Stony Brook, Stony Brook, NY, United States
Chao-Wei Tsai Department of Biomedical Informatics, Renaissance School of Medicine at Stony Brook, Stony Brook, NY, United States
Fusheng Wang Department of Computer Science, Stony Brook University, Stony Brook, NY, United States Department of Biomedical Informatics, Renaissance School of Medicine at Stony Brook, Stony Brook, NY, United States
Xinyu Dong Department of Computer Science, Stony Brook University, Stony Brook, NY, United States
Siao Sun Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, United States
Jianyuan Deng Department of Biomedical Informatics, Renaissance School of Medicine at Stony Brook, Stony Brook, NY, United States
Rajarsi Gupta Department of Biomedical Informatics, Renaissance School of Medicine at Stony Brook, Stony Brook, NY, United States
Joshua Miller Department of Medicine, Renaissance School of Medicine at Stony Brook, Stony Brook, NY, United States
Joel Saltz Department of Biomedical Informatics, Renaissance School of Medicine at Stony Brook, Stony Brook, NY, United States
Mary Saltz Department of Biomedical Informatics, Renaissance School of Medicine at Stony Brook, Stony Brook, NY, United States

Routray R, Tetarenko N, Abu-Assal C, Mockute R, Assuncao B, Chen H, Bao S, Danysz K, Desai S, Cicirello S, Willis V, Alford SH, Krishnamurthy V, Mingle E. Application of Augmented Intelligence for Pharmacovigilance Case Seriousness Determination. Drug Saf 2020;43:57-66. [PMID: 31605285 PMCID: PMC6965337 DOI: 10.1007/s40264-019-00869-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/17/2023]

Abstract

INTRODUCTION

Identification of adverse events and determination of their seriousness ensures timely detection of potential patient safety concerns. Adverse event seriousness is a key factor in defining reporting timelines and is often performed manually by pharmacovigilance experts. The dramatic increase in the volume of safety reports necessitates exploration of scalable solutions that also meet reporting timeline requirements.

OBJECTIVE

The aim of this study was to develop an augmented intelligence methodology for automatically identifying adverse event seriousness in spontaneous, solicited, and medical literature safety reports. Deep learning models were evaluated for accuracy and/or the F1 score against a ground truth labeled by pharmacovigilance experts.

METHODS

Using a stratified random sample of safety reports received by Celgene, we developed three neural networks for addressing identification of adverse event seriousness: (1) a binary adverse-event level seriousness classifier; (2) a classifier for determining seriousness categorization at the adverse-event level; and (3) an annotator for identifying seriousness criteria terms to provide supporting evidence at the document level.

RESULTS

The seriousness classifier achieved an accuracy of 83.0% in post-marketing reports, 92.9% in solicited reports, and 86.3% in medical literature reports. F1 scores for seriousness categorization were 77.7 for death, 78.9 for hospitalization, and 75.5 for important medical events. The seriousness annotator achieved an F1 score of 89.9 in solicited reports, and 75.2 in medical literature reports.

CONCLUSIONS

The results of this study indicate that a neural network approach can provide an accurate and scalable solution for potentially augmenting pharmacovigilance practitioner determination of adverse event seriousness in spontaneous, solicited, and medical literature reports.

Zeng L, Ren W, Shan L. Attention-based bidirectional gated recurrent unit neural networks for well logs prediction and lithology identification. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.07.026] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]

Hahn U, Oleynik M. Medical Information Extraction in the Age of Deep Learning. Yearb Med Inform 2020;29:208-220. [PMID: 32823318 PMCID: PMC7442512 DOI: 10.1055/s-0040-1702001] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022] Open

Slattery SM, Knight DC, Weese‐Mayer DE, Grobman WA, Downey DC, Murthy K. Machine learning mortality classification in clinical documentation with increased accuracy in visual-based analyses. Acta Paediatr 2020;109:1346-1353. [PMID: 31762098 DOI: 10.1111/apa.15109] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/29/2019] [Revised: 11/21/2019] [Accepted: 11/22/2019] [Indexed: 11/27/2022]

Liu S, Li T, Ding H, Tang B, Wang X, Chen Q, Yan J, Zhou Y. A hybrid method of recurrent neural network and graph neural network for next-period prescription prediction. INT J MACH LEARN CYB 2020;11:2849-2856. [PMID: 33727983 PMCID: PMC7308113 DOI: 10.1007/s13042-020-01155-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/30/2019] [Accepted: 06/10/2020] [Indexed: 01/17/2023]

Gao S, Alawad M, Schaefferkoetter N, Penberthy L, Wu XC, Durbin EB, Coyle L, Ramanathan A, Tourassi G. Using case-level context to classify cancer pathology reports. PLoS One 2020;15:e0232840. [PMID: 32396579 PMCID: PMC7217446 DOI: 10.1371/journal.pone.0232840] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2019] [Accepted: 04/22/2020] [Indexed: 11/18/2022] Open

Spasic I, Nenadic G. Clinical Text Data in Machine Learning: Systematic Review. JMIR Med Inform 2020;8:e17984. [PMID: 32229465 PMCID: PMC7157505 DOI: 10.2196/17984] [Citation(s) in RCA: 115] [Impact Index Per Article: 28.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/28/2020] [Revised: 02/24/2020] [Accepted: 02/24/2020] [Indexed: 12/22/2022] Open

Abstract

Background

Clinical narratives represent the main form of communication within health care, providing a personalized account of patient history and assessments, and offering rich information for clinical decision making. Natural language processing (NLP) has repeatedly demonstrated its feasibility to unlock evidence buried in clinical narratives. Machine learning can facilitate rapid development of NLP tools by leveraging large amounts of text data.

Objective

The main aim of this study was to provide systematic evidence on the properties of text data used to train machine learning approaches to clinical NLP. We also investigated the types of NLP tasks that have been supported by machine learning and how they can be applied in clinical practice.

Methods

Our methodology was based on the guidelines for performing systematic reviews. In August 2018, we used PubMed, a multifaceted interface, to perform a literature search against MEDLINE. We identified 110 relevant studies and extracted information about text data used to support machine learning, NLP tasks supported, and their clinical applications. The data properties considered included their size, provenance, collection methods, annotation, and any relevant statistics.

Results

The majority of datasets used to train machine learning models included only hundreds or thousands of documents. Only 10 studies used tens of thousands of documents, with a handful of studies utilizing more. Relatively small datasets were utilized for training even when much larger datasets were available. The main reason for such poor data utilization is the annotation bottleneck faced by supervised machine learning algorithms. Active learning was explored to iteratively sample a subset of data for manual annotation as a strategy for minimizing the annotation effort while maximizing the predictive performance of the model. Supervised learning was successfully used where clinical codes integrated with free-text notes into electronic health records were utilized as class labels. Similarly, distant supervision was used to utilize an existing knowledge base to automatically annotate raw text. Where manual annotation was unavoidable, crowdsourcing was explored, but it remains unsuitable because of the sensitive nature of data considered. Besides the small volume, training data were typically sourced from a small number of institutions, thus offering no hard evidence about the transferability of machine learning models. The majority of studies focused on text classification. Most commonly, the classification results were used to support phenotyping, prognosis, care improvement, resource management, and surveillance.
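
To make the active learning idea above concrete, here is a minimal sketch of uncertainty sampling, one common active learning strategy: the documents whose predicted probability is closest to 0.5 are sent for manual annotation first. This is an illustrative assumption, not necessarily the strategy used in the reviewed studies, and the function name, document IDs, and probabilities are all hypothetical.

```python
from typing import Dict, List


def select_for_annotation(probs: Dict[str, float], k: int) -> List[str]:
    """Return the k document IDs whose positive-class probability is nearest 0.5."""
    ranked = sorted(probs, key=lambda doc_id: abs(probs[doc_id] - 0.5))
    return ranked[:k]


# Hypothetical model outputs for five unlabeled clinical notes.
probs = {
    "note_a": 0.97,
    "note_b": 0.52,
    "note_c": 0.10,
    "note_d": 0.45,
    "note_e": 0.71,
}
picked = select_for_annotation(probs, 2)
print(picked)
```

In an iterative loop, the selected notes would be annotated, added to the training set, and the model retrained before the next selection round.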

Conclusions

We identified the data annotation bottleneck as one of the key obstacles to machine learning approaches in clinical NLP. Active learning and distant supervision were explored as a way of saving the annotation efforts. Future research in this field would benefit from alternatives such as data augmentation and transfer learning, or unsupervised learning, which do not require data annotation.

Ju M, Short AD, Thompson P, Bakerly ND, Gkoutos GV, Tsaprouni L, Ananiadou S. Annotating and detecting phenotypic information for chronic obstructive pulmonary disease. JAMIA Open 2020;2:261-271. [PMID: 31984360 PMCID: PMC6951876 DOI: 10.1093/jamiaopen/ooz009] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/19/2018] [Revised: 02/21/2019] [Accepted: 03/19/2019] [Indexed: 12/29/2022] Open

Abstract

Objectives

Chronic obstructive pulmonary disease (COPD) phenotypes cover a range of lung abnormalities. To allow text mining methods to identify pertinent and potentially complex information about these phenotypes from textual data, we have developed a novel annotated corpus, which we use to train a neural network-based named entity recognizer to detect fine-grained COPD phenotypic information.

Materials and methods

Since COPD phenotype descriptions often mention other concepts within them (proteins, treatments, etc.), our corpus annotations include both outermost phenotype descriptions and concepts nested within them. Our neural layered bidirectional long short-term memory conditional random field (BiLSTM-CRF) network firstly recognizes nested mentions, which are fed into subsequent BiLSTM-CRF layers, to help to recognize enclosing phenotype mentions.

Results

Our corpus of 30 full papers (available at: http://www.nactem.ac.uk/COPD) is annotated by experts with 27 030 phenotype-related concept mentions, most of which are automatically linked to UMLS Metathesaurus concepts. When trained using the corpus, our BiLSTM-CRF network outperforms other popular approaches in recognizing detailed phenotypic information.

Discussion

Information extracted by our method can facilitate efficient location and exploration of detailed information about phenotypes, for example, those specifically concerning reactions to treatments.

Conclusion

The importance of our corpus for developing methods to extract fine-grained information about COPD phenotypes is demonstrated through its successful use to train a layered BiLSTM-CRF network to extract phenotypic information at various levels of granularity. The minimal human intervention needed for training should permit ready adaption to extracting phenotypic information about other diseases.

SECNLP: A survey of embeddings in clinical natural language processing. J Biomed Inform 2020;101:103323. [DOI: 10.1016/j.jbi.2019.103323] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2019] [Revised: 09/12/2019] [Accepted: 10/27/2019] [Indexed: 12/11/2022]

Svenson P, Haralabopoulos G, Torres Torres M. Sepsis Deterioration Prediction Using Channelled Long Short-Term Memory Networks. Artif Intell Med 2020. [DOI: 10.1007/978-3-030-59137-3_32] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]

Weegar R, Pérez A, Casillas A, Oronoz M. Recent advances in Swedish and Spanish medical entity recognition in clinical texts using deep neural approaches. BMC Med Inform Decis Mak 2019;19:274. [PMID: 31865900 PMCID: PMC6927099 DOI: 10.1186/s12911-019-0981-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Li Y, Jin R, Luo Y. Classifying relations in clinical narratives using segment graph convolutional and recurrent neural networks (Seg-GCRNs). J Am Med Inform Assoc 2019;26:262-268. [PMID: 30590613 DOI: 10.1093/jamia/ocy157] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2018] [Accepted: 11/03/2018] [Indexed: 01/16/2023] Open

Rawat BPS, Li F, Yu H. Naranjo Question Answering using End-to-End Multi-task Learning Model. KDD : PROCEEDINGS. INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING 2019;2019:2547-2555. [PMID: 31799022 DOI: 10.1145/3292500.3330770] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]

AlSaad R, Malluhi Q, Janahi I, Boughorbel S. Interpreting patient-Specific risk prediction using contextual decomposition of BiLSTMs: application to children with asthma. BMC Med Inform Decis Mak 2019;19:214. [PMID: 31703676 PMCID: PMC6842261 DOI: 10.1186/s12911-019-0951-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/28/2019] [Accepted: 10/28/2019] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Predictive modeling with longitudinal electronic health record (EHR) data offers great promise for accelerating personalized medicine and better informs clinical decision-making. Recently, deep learning models have achieved state-of-the-art performance for many healthcare prediction tasks. However, deep models lack interpretability, which is integral to successful decision-making and can lead to better patient care. In this paper, we build upon the contextual decomposition (CD) method, an algorithm for producing importance scores from long short-term memory networks (LSTMs). We extend the method to bidirectional LSTMs (BiLSTMs) and use it in the context of predicting future clinical outcomes using patients' EHR historical visits.

METHODS

We use a real EHR dataset comprising 11071 patients to evaluate and compare CD interpretations from LSTM and BiLSTM models. First, we train LSTM and BiLSTM models for the task of predicting which pre-school children with respiratory system-related complications will have asthma at school-age. After that, we conduct quantitative and qualitative analysis to evaluate the CD interpretations produced by the contextual decomposition of the trained models. In addition, we develop an interactive visualization to demonstrate the utility of CD scores in explaining predicted outcomes.

RESULTS

Our experimental evaluation demonstrates that whenever a clear visit-level pattern exists, the models learn that pattern and the contextual decomposition can appropriately attribute the prediction to the correct pattern. In addition, the results confirm that the CD scores agree to a large extent with the importance scores generated using logistic regression coefficients. Our main insight was that rather than interpreting the attribution of individual visits to the predicted outcome, we could instead attribute a model's prediction to a group of visits.

CONCLUSION

We presented quantitative and qualitative evidence that CD interpretations can explain patient-specific predictions using CD attributions of individual visits or a group of visits.

Jin Y, Li F, Vimalananda VG, Yu H. Automatic Detection of Hypoglycemic Events From the Electronic Health Record Notes of Diabetes Patients: Empirical Study. JMIR Med Inform 2019;7:e14340. [PMID: 31702562 PMCID: PMC6913754 DOI: 10.2196/14340] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 04/12/2019] [Revised: 07/08/2019] [Accepted: 10/19/2019] [Indexed: 01/25/2023] Open

Malmasi S, Ge W, Hosomura N, Turchin A. Comparing information extraction techniques for low-prevalence concepts: The case of insulin rejection by patients. J Biomed Inform 2019;99:103306. [DOI: 10.1016/j.jbi.2019.103306] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/18/2019] [Revised: 09/23/2019] [Accepted: 10/10/2019] [Indexed: 02/05/2023]

Savova GK, Danciu I, Alamudun F, Miller T, Lin C, Bitterman DS, Tourassi G, Warner JL. Use of Natural Language Processing to Extract Clinical Cancer Phenotypes from Electronic Medical Records. Cancer Res 2019;79:5463-5470. [PMID: 31395609 PMCID: PMC7227798 DOI: 10.1158/0008-5472.can-19-0579] [Citation(s) in RCA: 75] [Impact Index Per Article: 15.0] [Received: 02/15/2019] [Revised: 06/17/2019] [Accepted: 07/29/2019] [Indexed: 12/12/2022]