1
Klein AZ, Banda JM, Guo Y, Schmidt AL, Xu D, Flores Amaro I, Rodriguez-Esteban R, Sarker A, Gonzalez-Hernandez G. Overview of the 8th Social Media Mining for Health Applications (#SMM4H) shared tasks at the AMIA 2023 Annual Symposium. J Am Med Inform Assoc 2024; 31:991-996. [PMID: 38218723 PMCID: PMC10990511 DOI: 10.1093/jamia/ocae010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 01/05/2024] [Accepted: 01/11/2024] [Indexed: 01/15/2024]
Abstract
OBJECTIVE The aim of the Social Media Mining for Health Applications (#SMM4H) shared tasks is to take a community-driven approach to address the natural language processing and machine learning challenges inherent to utilizing social media data for health informatics. In this paper, we present the annotated corpora, a technical summary of participants' systems, and the performance results. METHODS The eighth iteration of the #SMM4H shared tasks was hosted at the AMIA 2023 Annual Symposium and consisted of 5 tasks that represented various social media platforms (Twitter and Reddit), languages (English and Spanish), methods (binary classification, multi-class classification, extraction, and normalization), and topics (COVID-19, therapies, social anxiety disorder, and adverse drug events). RESULTS In total, 29 teams registered, representing 17 countries. In general, the top-performing systems used deep neural network architectures based on pre-trained transformer models. In particular, the top-performing systems for the classification tasks were based on single models that were pre-trained on social media corpora. CONCLUSION To facilitate future work, the datasets (a total of 61 353 posts) will remain available by request, and the CodaLab sites will remain active for a post-evaluation phase.
Affiliation(s)
- Ari Z Klein
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA 19104, United States
- Juan M Banda
- Department of Computer Science, Georgia State University, Atlanta, GA 30302, United States
- Yuting Guo
- Department of Biomedical Informatics, Emory University, Atlanta, GA 30322, United States
- Dongfang Xu
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States
- Ivan Flores Amaro
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States
- Abeed Sarker
- Department of Biomedical Informatics, Emory University, Atlanta, GA 30322, United States
2
Lau-Min KS, Marini J, Shah NK, Pucci D, Blauch AN, Cambareri C, Mooney B, Agarwal P, Johnston C, Schumacher RP, White K, Gabriel PE, Rosin R, Jacobs LA, Shulman LN. Pilot Study of a Mobile Phone Chatbot for Medication Adherence and Toxicity Management Among Patients With GI Cancers on Capecitabine. JCO Oncol Pract 2024; 20:483-490. [PMID: 38237102 DOI: 10.1200/op.23.00365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 10/11/2023] [Accepted: 12/04/2023] [Indexed: 04/12/2024]
Abstract
PURPOSE Capecitabine is an oral chemotherapy used to treat many gastrointestinal cancers. Its complex dosing and narrow therapeutic index make medication adherence and toxicity management crucial for quality care. METHODS We conducted a pilot study of PENNY-GI, a mobile phone text messaging-based chatbot that leverages algorithmic surveys and natural language processing to promote medication adherence and toxicity management among patients with gastrointestinal cancers on capecitabine. Eligibility initially included all capecitabine-containing regimens but was subsequently restricted to capecitabine monotherapy because of challenges in integrating PENNY-GI with radiation and intravenous chemotherapy schedules. We used design thinking principles and real-time data on safety, accuracy, and usefulness to make iterative refinements to PENNY-GI with the goal of minimizing the proportion of text messaging exchanges with incorrect medication or symptom management recommendations. All patients were invited to participate in structured exit interviews to provide feedback on PENNY-GI. RESULTS We enrolled 40 patients (median age 64.5 years, 52.5% male, 62.5% White, 55.0% with colorectal cancer, 50.0% on capecitabine monotherapy). We identified 284 of 3,895 (7.3%) medication-related and 13 of 527 (2.5%) symptom-related text messaging exchanges with incorrect recommendations. In exit interviews with 24 patients, participants reported finding the medication reminders reliable and user-friendly, but the symptom management tool was too simplistic to be helpful. CONCLUSION Although PENNY-GI provided accurate recommendations in >90% of text messaging exchanges, we identified multiple limitations with respect to the intervention's generalizability, usefulness, and scalability. Lessons from this pilot study should inform future efforts to develop and implement digital health interventions in oncology.
Affiliation(s)
- Kelsey S Lau-Min
- Division of Hematology/Oncology, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA
- Jessica Marini
- Hospital of the University of Pennsylvania, Penn Medicine, Philadelphia, PA
- Nishant K Shah
- Department of Radiation Oncology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
- Donna Pucci
- Division of Hematology/Oncology, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
- Abigail N Blauch
- Division of Hematology/Oncology, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
- Bethany Mooney
- Division of Hematology/Oncology, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
- Parul Agarwal
- Division of Hematology/Oncology, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
- Peter E Gabriel
- Department of Radiation Oncology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
- Roy Rosin
- Center for Health Care Innovation, Penn Medicine, Philadelphia, PA
- Linda A Jacobs
- Division of Hematology/Oncology, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
- Lawrence N Shulman
- Division of Hematology/Oncology, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
3
Klein AZ, Banda JM, Guo Y, Schmidt AL, Xu D, Amaro JIF, Rodriguez-Esteban R, Sarker A, Gonzalez-Hernandez G. Overview of the 8th Social Media Mining for Health Applications (#SMM4H) Shared Tasks at the AMIA 2023 Annual Symposium. medRxiv: the preprint server for health sciences 2023:2023.11.06.23298168. [PMID: 37986776 PMCID: PMC10659479 DOI: 10.1101/2023.11.06.23298168] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 11/22/2023]
Abstract
The aim of the Social Media Mining for Health Applications (#SMM4H) shared tasks is to take a community-driven approach to address the natural language processing and machine learning challenges inherent to utilizing social media data for health informatics. The eighth iteration of the #SMM4H shared tasks was hosted at the AMIA 2023 Annual Symposium and consisted of five tasks that represented various social media platforms (Twitter and Reddit), languages (English and Spanish), methods (binary classification, multi-class classification, extraction, and normalization), and topics (COVID-19, therapies, social anxiety disorder, and adverse drug events). In total, 29 teams registered, representing 18 countries. In this paper, we present the annotated corpora, a technical summary of the systems, and the performance results. In general, the top-performing systems used deep neural network architectures based on pre-trained transformer models. In particular, the top-performing systems for the classification tasks were based on single models that were pre-trained on social media corpora. To facilitate future work, the datasets (a total of 61,353 posts) will remain available by request, and the CodaLab sites will remain active for a post-evaluation phase.
Affiliation(s)
- Ari Z. Klein
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Juan M. Banda
- Department of Computer Science, Georgia State University, Atlanta, GA, USA
- Yuting Guo
- Department of Biomedical Informatics, Emory University, Atlanta, GA, USA
- Dongfang Xu
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Abeed Sarker
- Department of Biomedical Informatics, Emory University, Atlanta, GA, USA
4
Golder S, O'Connor K, Wang Y, Gonzalez Hernandez G. The Role of Social Media for Identifying Adverse Drug Events Data in Pharmacovigilance: Protocol for a Scoping Review. JMIR Res Protoc 2023; 12:e47068. [PMID: 37531158 PMCID: PMC10433020 DOI: 10.2196/47068] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/07/2023] [Revised: 05/05/2023] [Accepted: 05/06/2023] [Indexed: 08/03/2023]
Abstract
BACKGROUND Adverse drug events (ADEs) are a considerable public health burden resulting in disability, hospitalization, and death. Even those ADEs deemed nonserious can severely impact a patient's quality of life and adherence to intervention. Monitoring medication safety, however, is challenging. Social media may be a useful adjunct for obtaining real-world data on ADEs. While many studies have been undertaken to detect adverse events on social media, a consensus has not yet been reached as to the value of social media in pharmacovigilance or its role in pharmacovigilance in relation to more traditional data sources. OBJECTIVE The aim of the study is to evaluate and characterize the use of social media in ADE detection and pharmacovigilance as compared to other data sources. METHODS A scoping review will be undertaken. We will search 11 bibliographical databases as well as Google Scholar, hand-searching, and forward and backward citation searching. Records will be screened in Covidence by 2 independent reviewers at both title and abstract stage as well as full text. Studies will be included if they used any type of social media (such as Twitter or patient forums) to detect any type of adverse event associated with any type of medication and then compared the results from social media to any other data source (such as spontaneous reporting systems or clinical literature). Data will be extracted using a data extraction sheet piloted by the authors. Important data on the types of methods used (such as machine learning), any limitations of the methods used, types of adverse events and drugs searched for and included, availability of data and code, details of the comparison data source, and the results and conclusions will be extracted. 
RESULTS We will present descriptive summary statistics as well as identify any patterns in the types and timing of ADEs detected, including but not limited to the similarities and differences in what is reported, gaps in the evidence, and the methods used to extract ADEs from social media data. We will also summarize how the data from social media compares to conventional data sources. The literature will be organized by the data source for comparison. Where possible, we will analyze the impact of the types of adverse events, the social media platform used, and the methods used. CONCLUSIONS This scoping review will provide a valuable summary of a large body of research and important information for pharmacovigilance as well as suggest future directions of further research in this area. Through the comparisons with other data sources, we will be able to conclude the added value of social media in monitoring adverse events of medications, in terms of type of adverse events and timing. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) PRR1-10.2196/47068.
Affiliation(s)
- Su Golder
- Department of Health Sciences, University of York, York, United Kingdom
- Karen O'Connor
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Yunwen Wang
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, West Hollywood, CA, United States
5
Dietrich J, Kazzer P. Provision and Characterization of a Corpus for Pharmaceutical, Biomedical Named Entity Recognition for Pharmacovigilance: Evaluation of Language Registers and Training Data Sufficiency. Drug Saf 2023; 46:765-779. [PMID: 37338799 PMCID: PMC10345043 DOI: 10.1007/s40264-023-01322-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/16/2023] [Indexed: 06/21/2023]
Abstract
INTRODUCTION AND OBJECTIVE Machine learning (ML) systems are widely used for automatic entity recognition in pharmacovigilance. Publicly available datasets do not allow the use of annotated entities independently, focusing on small entity subsets or on single language registers (informal or scientific language). The objective of the current study was to create a dataset that enables independent usage of entities, explores the performance of predictive ML models on different registers, and introduces a method to investigate entity cut-off performance. METHODS A dataset has been created combining different registers with 18 different entities. We applied this dataset to compare the performance of integrated models with models created with single language registers only. We introduced fractional stratified k-fold cross-validation to determine model performance on entity level by using training dataset fractions. We investigated the course of entity performance with fractions of training datasets and evaluated entity peak and cut-off performance. RESULTS The dataset combines 1400 records (scientific language: 790; informal language: 610) with 2622 sentences and 9989 entity occurrences and combines data from external (801 records) and internal sources (599 records). We demonstrated that single language register models underperform compared to integrated models trained with multiple language registers. CONCLUSIONS A manually annotated dataset with a variety of different pharmaceutical and biomedical entities was created and is made available to the research community. Our results show that models that combine different registers provide better maintainability, have higher robustness, and have similar or higher performance. Fractional stratified k-fold cross-validation allows the evaluation of training data sufficiency on the entity level.
Affiliation(s)
- Jürgen Dietrich
- Bayer AG, Pharmaceuticals, Medical Affairs & Pharmacovigilance, Data Science & Insights, Müllerstr. 170, 13353, Berlin, Germany.
6
French E, McInnes BT. An overview of biomedical entity linking throughout the years. J Biomed Inform 2023; 137:104252. [PMID: 36464228 PMCID: PMC9845184 DOI: 10.1016/j.jbi.2022.104252] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 06/22/2022] [Revised: 09/19/2022] [Accepted: 11/15/2022] [Indexed: 12/04/2022]
Abstract
Biomedical Entity Linking (BEL) is the task of mapping spans of text within biomedical documents to normalized, unique identifiers within an ontology. This is an important task in natural language processing, both for translational information extraction applications and for providing context for downstream tasks like relationship extraction. In this paper, we survey the progression of BEL from its inception in the late 1980s to present-day state-of-the-art systems, provide a comprehensive list of datasets available for training BEL systems, reference shared tasks focused on BEL, discuss the technical components that comprise BEL systems, and discuss possible directions for the future of the field.
Affiliation(s)
- Evan French
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA.
- Bridget T McInnes
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
7
Guellil I, Wu J, Wu H, Sun T, Alex B. Edinburgh_UCL_Health@SMM4H'22: From Glove to Flair for handling imbalanced healthcare corpora related to Adverse Drug Events, Change in medication and self-reporting vaccination. Proceedings of COLING. International Conference on Computational Linguistics 2022; 2022:148-152. [PMID: 36338790 PMCID: PMC7613791] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
This paper reports on the performance of Edinburgh_UCL_Health's models in the Social Media Mining for Health (SMM4H) 2022 shared tasks. Our team participated in the tasks related to the identification of Adverse Drug Events (ADEs), the classification of change in medication (change-med), and the classification of self-report of vaccination (self-vaccine). Our best-performing models are based on DeepADEMiner (with respective F1 = 0.64, 0.62 and 0.39 for ADE identification), on a GloVe model trained on Twitter (with F1 = 0.11 for change-med), and finally on a stacked embedding including a layer of GloVe embeddings and two layers of Flair embeddings (with F1 = 0.77 for self-report).
8
Gonzalez-Hernandez G, Krallinger M, Muñoz M, Rodriguez-Esteban R, Uzuner Ö, Hirschman L. Challenges and opportunities for mining adverse drug reactions: perspectives from pharma, regulatory agencies, healthcare providers and consumers. Database (Oxford) 2022; 2022:baac071. [PMID: 36050787 PMCID: PMC9436770 DOI: 10.1093/database/baac071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2022] [Revised: 07/08/2022] [Accepted: 08/25/2022] [Indexed: 11/17/2022]
Abstract
Monitoring drug safety is a central concern throughout the drug life cycle. Information about toxicity and adverse events is generated at every stage of this life cycle, and stakeholders have a strong interest in applying text mining and artificial intelligence (AI) methods to manage the ever-increasing volume of this information. Recognizing the importance of these applications and the role of challenge evaluations to drive progress in text mining, the organizers of BioCreative VII (Critical Assessment of Information Extraction in Biology) convened a panel of experts to explore 'Challenges in Mining Drug Adverse Reactions'. This article is an outgrowth of the panel; each panelist has highlighted specific text mining application(s), based on their research and their experiences in organizing text mining challenge evaluations. While these highlighted applications only sample the complexity of this problem space, they reveal both opportunities and challenges for text mining to aid in the complex process of drug discovery, testing, marketing and post-market surveillance. Stakeholders are eager to embrace natural language processing and AI tools to help in this process, provided that these tools can be demonstrated to add value to stakeholder workflows. This creates an opportunity for the BioCreative community to work in partnership with regulatory agencies, pharma and the text mining community to identify next steps for future challenge evaluations.
Affiliation(s)
- Graciela Gonzalez-Hernandez
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, 700 N. San Vicente Blvd., West Hollywood, CA 90069, USA
- Martin Krallinger
- Life Sciences—Text Mining, Barcelona Supercomputing Center, Plaça Eusebi Güell, 1-3, Barcelona 08034, Spain
- Monica Muñoz
- Division of Pharmacovigilance, Office of Surveillance and Epidemiology, Center of Drug Evaluation and Research, FDA, 10903 New Hampshire Ave, Silver Spring, MD 20993, USA
- Raul Rodriguez-Esteban
- Roche Innovation Center Basel, Roche Pharmaceuticals, Grenzacherstrasse 124, Basel 4070, Switzerland
- Özlem Uzuner
- Information Sciences and Technology, George Mason University, 4400 University Dr, Fairfax, VA 22030, USA
- Lynette Hirschman
- MITRE Labs, The MITRE Corporation, 202 Burlington Rd., Bedford, MA 01730, USA
9
Detecting Personal Medication Intake in Twitter via Domain Attention-Based RNN with Multi-Level Features. Computational Intelligence and Neuroscience 2022; 2022:5467262. [PMID: 35983151 PMCID: PMC9381240 DOI: 10.1155/2022/5467262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2022] [Revised: 07/08/2022] [Accepted: 07/13/2022] [Indexed: 11/17/2022]
Abstract
Personal medication intake detection aims to automatically detect tweets that show clear evidence of personal medication consumption. It is a research topic that has attracted considerable attention in drug safety surveillance. This task inevitably depends on medical domain information, yet the current main model for this task does not explicitly consider domain information. To tackle this problem, we propose a domain attention mechanism for recurrent neural networks (LSTMs) with a multi-level feature representation of Twitter data. Specifically, we use a character-level CNN to capture morphological features at the word level. Subsequently, we feed them with word embeddings into a BiLSTM to get the hidden representation of a tweet. An attention mechanism is introduced over the hidden states of the BiLSTM to attend to special medical information. Finally, classification is performed on the weighted hidden representation of tweets. Experiments on a publicly available benchmark dataset show that our model can exploit a domain attention mechanism to consider medical information and improve performance. For example, our approach achieves a precision of 0.708, a recall of 0.694, and an F1 score of 0.697, significantly outperforming multiple strong and relevant baselines.
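The attention step described in this abstract (weighting BiLSTM hidden states before classification) can be pictured with a small Python sketch. This is only an illustration of generic dot-product attention pooling: the function name `attention_pool`, the `query` vector standing in for domain information, and the toy inputs are assumptions, not the authors' implementation.

```python
import math

def attention_pool(hidden_states, query):
    """Pool a sequence of hidden-state vectors into one vector.

    hidden_states: list of equal-length float lists (one per token).
    query: float list of the same length; a hypothetical stand-in for
           the domain (medical) information the abstract mentions.
    Returns (weights, pooled): softmax attention weights over tokens and
    the weighted sum of hidden states.
    """
    # Dot-product relevance score of each hidden state against the query.
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    # Numerically stable softmax over the token positions.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of hidden states, one output value per dimension.
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, pooled

if __name__ == "__main__":
    toy_states = [[1.0, 0.0], [0.0, 1.0]]  # two made-up token states
    print(attention_pool(toy_states, [2.0, 0.0]))
```

In a full model, the pooled vector would feed a classifier layer; here the query is fixed by hand, whereas a trained system would learn it.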
10
Guo Y, Ge Y, Yang YC, Al-Garadi MA, Sarker A. Comparison of Pretraining Models and Strategies for Health-Related Social Media Text Classification. Healthcare (Basel) 2022; 10:healthcare10081478. [PMID: 36011135 PMCID: PMC9408372 DOI: 10.3390/healthcare10081478] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/05/2022] [Revised: 07/29/2022] [Accepted: 08/02/2022] [Indexed: 11/24/2022]
Abstract
Pretrained contextual language models proposed in the recent past have been reported to achieve state-of-the-art performance in many natural language processing (NLP) tasks, including those involving health-related social media data. We sought to evaluate the effectiveness of different pretrained transformer-based models for social media-based health-related text classification tasks. An additional objective was to explore and propose effective pretraining strategies to improve machine learning performance on such datasets and tasks. We benchmarked six transformer-based models that were pretrained with texts from different domains and sources (BERT, RoBERTa, BERTweet, TwitterBERT, BioClinical_BERT, and BioBERT) on 22 social media-based health-related text classification tasks. For the top-performing models, we explored the possibility of further boosting performance by comparing several pretraining strategies: domain-adaptive pretraining (DAPT), source-adaptive pretraining (SAPT), and a novel approach called topic-specific pretraining (TSPT). We also attempted to interpret the impacts of distinct pretraining strategies by visualizing document-level embeddings at different stages of the training process. RoBERTa outperformed BERTweet on most tasks and performed better than the other models. BERT, TwitterBERT, BioClinical_BERT and BioBERT consistently underperformed. For pretraining strategies, SAPT performed better than or comparably to the off-the-shelf models, and significantly outperformed DAPT. SAPT + TSPT showed consistently high performance, with statistically significant improvement in three tasks. Our findings demonstrate that RoBERTa and BERTweet are excellent off-the-shelf models for health-related social media text classification, and extended pretraining using SAPT and TSPT can further improve performance.
Affiliation(s)
- Yuting Guo
- Department of Biomedical Informatics, Emory University, Atlanta, GA 30322, USA
- Yao Ge
- Department of Biomedical Informatics, Emory University, Atlanta, GA 30322, USA
- Yuan-Chi Yang
- Department of Biomedical Informatics, Emory University, Atlanta, GA 30322, USA
- Mohammed Ali Al-Garadi
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37240, USA
- Abeed Sarker
- Department of Biomedical Informatics, Emory University, Atlanta, GA 30322, USA
11
Xu D, Miller T. A simple neural vector space model for medical concept normalization using concept embeddings. J Biomed Inform 2022; 130:104080. [PMID: 35472514 PMCID: PMC9351985 DOI: 10.1016/j.jbi.2022.104080] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/19/2022] [Revised: 04/15/2022] [Accepted: 04/19/2022] [Indexed: 11/24/2022]
Abstract
OBJECTIVE Medical concept normalization (MCN), the task of linking textual mentions to concepts in an ontology, provides a solution to unify different ways of referring to the same concept. In this paper, we present a simple neural MCN model that takes mentions as input and directly predicts concepts. MATERIALS AND METHODS We evaluate our proposed model on clinical datasets from the ShARe/CLEF eHealth 2013 shared task and the 2019 n2c2/OHNLP shared task track 3. Our neural MCN model consists of an encoder and a normalized temperature-scaled softmax (NT-softmax) layer that maximizes the cosine similarity score of matching the mention to the correct concept. We adopt SAPBERT as the encoder and initialize the weights in the NT-softmax layer with pre-computed concept embeddings from SAPBERT. RESULTS Our proposed neural model achieves competitive performance on ShARe/CLEF 2013 and establishes a new state of the art on 2019-n2c2-MCN. Yet this model is simpler than most prior work: it requires no complex pipelines, no hand-crafted rules, and no preprocessing, making it simpler to apply in new settings. DISCUSSION Analyses of our proposed model show that the NT-softmax is better than the conventional softmax on the MCN task, and both the CUI-less threshold parameter and the initialization of the weight vectors in the NT-softmax layer contribute to the improvements. CONCLUSION We propose a simple neural model for clinical MCN, a one-step approach with simpler inference and more effective performance than prior work. Our analyses demonstrate that future work on MCN may require more effort on unseen concepts.
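As a rough illustration of the NT-softmax layer described in this abstract, the following Python sketch scores one mention vector against a set of concept vectors by cosine similarity, divides by a temperature, and applies a softmax. The function names, the toy vectors, and the temperature value are assumptions made for illustration; the paper's SAPBERT encoder, CUI-less threshold, and training procedure are not reproduced here.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def nt_softmax(mention_vec, concept_vecs, temperature=0.1):
    """Normalized temperature-scaled softmax over candidate concepts.

    mention_vec: embedding of the mention (in the paper, from an encoder).
    concept_vecs: one weight vector per concept (hypothetical toy values here).
    temperature: assumed hyperparameter; smaller values sharpen the distribution.
    """
    # Logits are cosine similarities scaled by 1 / temperature.
    logits = [cosine(mention_vec, c) / temperature for c in concept_vecs]
    # Numerically stable softmax.
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

if __name__ == "__main__":
    concepts = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # made-up concept vectors
    print(nt_softmax([1.0, 0.0], concepts))
```

Because the logits are bounded cosine similarities, the temperature controls how peaked the resulting distribution can become, which is the property the sketch is meant to show.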
Affiliation(s)
- Dongfang Xu
- Computational Health Informatics Program, Boston Children's Hospital, Boston, MA, USA; Department of Pediatrics, Harvard Medical School Boston, MA, USA.
- Timothy Miller
- Computational Health Informatics Program, Boston Children's Hospital, Boston, MA, USA; Department of Pediatrics, Harvard Medical School Boston, MA, USA
12
Zhao Y, Yu Y, Wang H, Li Y, Deng Y, Jiang G, Luo Y. Machine Learning in Causal Inference: Application in Pharmacovigilance. Drug Saf 2022; 45:459-476. [PMID: 35579811 PMCID: PMC9114053 DOI: 10.1007/s40264-022-01155-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Accepted: 02/09/2022] [Indexed: 01/28/2023]
Abstract
Monitoring adverse drug events or pharmacovigilance has been promoted by the World Health Organization to assure the safety of medicines through a timely and reliable information exchange regarding drug safety issues. We aim to discuss the application of machine learning methods as well as causal inference paradigms in pharmacovigilance. We first reviewed data sources for pharmacovigilance. Then, we examined traditional causal inference paradigms, their applications in pharmacovigilance, and how machine learning methods and causal inference paradigms were integrated to enhance the performance of traditional causal inference paradigms. Finally, we summarized issues with currently mainstream correlation-based machine learning models and how the machine learning community has tried to address these issues by incorporating causal inference paradigms. Our literature search revealed that most existing data sources and tasks for pharmacovigilance were not designed for causal inference. Additionally, pharmacovigilance was lagging in adopting machine learning-causal inference integrated models. We highlight several currently trending directions or gaps to integrate causal inference with machine learning in pharmacovigilance research. Finally, our literature search revealed that the adoption of causal paradigms can mitigate known issues with machine learning models. We foresee that the pharmacovigilance domain can benefit from the progress in the machine learning field.
Affiliation(s)
- Yiqing Zhao
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, 750 N Lake Shore Drive, Room 11-189, Chicago, IL, 60611, USA
- Yue Yu
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN, 55902, USA
- Hanyin Wang
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, 750 N Lake Shore Drive, Room 11-189, Chicago, IL, 60611, USA
- Yikuan Li
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, 750 N Lake Shore Drive, Room 11-189, Chicago, IL, 60611, USA
- Yu Deng
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, 750 N Lake Shore Drive, Room 11-189, Chicago, IL, 60611, USA
- Guoqian Jiang
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN, 55902, USA
- Yuan Luo
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, 750 N Lake Shore Drive, Room 11-189, Chicago, IL, 60611, USA.
13
Identifying Adverse Drug Reaction-Related Text from Social Media: A Multi-View Active Learning Approach with Various Document Representations. Information 2022. [DOI: 10.3390/info13040189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Abstract
Adverse drug reactions (ADRs) are a huge public health issue. Identifying text that mentions ADRs from a large volume of social media data is important. However, we need to address two challenges for high-performing ADR-related text detection: the data imbalance problem and the requirement of simultaneously using data-driven information and handcrafted information. Therefore, we propose an approach named multi-view active learning using domain-specific and data-driven document representations (MVAL4D), endeavoring to enhance the predictive capability and alleviate the requirement of labeled data. Specifically, a new view-generation mechanism is proposed to generate multiple views by simultaneously exploiting various document representations obtained using handcrafted feature engineering and by performing deep learning methods. Moreover, different from previous active learning studies in which all instances are chosen using the same selection criterion, MVAL4D adopts different criteria (i.e., confidence and informativeness) to select potentially positive instances and potentially negative instances for manual annotation. The experimental results verify the effectiveness of MVAL4D. The proposed approach can be generalized to many other text classification tasks. Moreover, it can offer a solid foundation for the ADR mention extraction task, and improve the feasibility of monitoring drug safety using social media data.
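The selection step described in this abstract can be illustrated with a minimal sketch. It is not the MVAL4D implementation: it assumes that "confidence" means the highest predicted probability among likely positives and that "informativeness" means the highest binary entropy among likely negatives, a mapping the abstract suggests but does not spell out. The function names, the 0.5 threshold, and the toy probabilities are all hypothetical, and the multi-view generation step is omitted.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy (in bits) of a Bernoulli prediction; 0.0 at p = 0 or p = 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def select_for_annotation(probs, n_pos, n_neg, threshold=0.5):
    """Pick unlabeled instances for manual annotation.

    probs maps an instance id to the predicted probability that the text
    is ADR-related (hypothetical classifier output).
    """
    likely_pos = [i for i, p in probs.items() if p >= threshold]
    likely_neg = [i for i, p in probs.items() if p < threshold]
    # Assumed "confidence" criterion: most confident predicted positives.
    pos = sorted(likely_pos, key=lambda i: probs[i], reverse=True)[:n_pos]
    # Assumed "informativeness" criterion: most uncertain predicted negatives.
    neg = sorted(likely_neg, key=lambda i: binary_entropy(probs[i]),
                 reverse=True)[:n_neg]
    return pos, neg


# Toy probabilities, invented for illustration only.
toy_probs = {"t1": 0.95, "t2": 0.60, "t3": 0.45,
             "t4": 0.05, "t5": 0.80, "t6": 0.30}
selected_pos, selected_neg = select_for_annotation(toy_probs, n_pos=2, n_neg=2)
print(selected_pos, selected_neg)
```

Using separate criteria for the two pools is one way to keep the rare positive class represented in each annotation round, which is the imbalance concern the abstract raises.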
14
Jha K, Zhang A. Continual knowledge infusion into pre-trained biomedical language models. Bioinformatics 2022; 38:494-502. [PMID: 34554186 DOI: 10.1093/bioinformatics/btab671] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/09/2021] [Revised: 09/12/2021] [Accepted: 09/20/2021] [Indexed: 02/03/2023]
Abstract
MOTIVATION Biomedical language models produce meaningful concept representations that are useful for a variety of biomedical natural language processing (bioNLP) applications such as named entity recognition, relationship extraction and question answering. Recent research trends have shown that the contextualized language models (e.g. BioBERT, BioELMo) possess tremendous representational power and are able to achieve impressive accuracy gains. However, these models are still unable to learn high-quality representations for concepts with low context information (i.e. rare words). Infusing the complementary information from knowledge-bases (KBs) is likely to be helpful when the corpus-specific information is insufficient to learn robust representations. Moreover, as the biomedical domain contains numerous KBs, it is imperative to develop approaches that can integrate the KBs in a continual fashion. RESULTS We propose a new representation learning approach that progressively fuses the semantic information from multiple KBs into the pretrained biomedical language models. Since most of the KBs in the biomedical domain are expressed as parent-child hierarchies, we choose to model the hierarchical KBs and propose a new knowledge modeling strategy that encodes their topological properties at a granular level. Moreover, the proposed continual learning technique efficiently updates the concept representations to accommodate the new knowledge while preserving the memory efficiency of contextualized language models. Altogether, the proposed approach generates knowledge-powered embeddings with high fidelity and learning efficiency. Extensive experiments conducted on bioNLP tasks validate the efficacy of the proposed approach and demonstrate its capability in generating robust concept representations.
Affiliation(s)
- Kishlay Jha
- Department of Computer Science, University of Virginia, Charlottesville, VA 22903, USA
- Aidong Zhang
- Department of Computer Science, University of Virginia, Charlottesville, VA 22903, USA
15
Liang L, Hu J, Sun G, Hong N, Wu G, He Y, Li Y, Hao T, Liu L, Gong M. Artificial Intelligence-Based Pharmacovigilance in the Setting of Limited Resources. Drug Saf 2022; 45:511-519. [PMID: 35579814 PMCID: PMC9112260 DOI: 10.1007/s40264-022-01170-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Accepted: 02/27/2022] [Indexed: 01/28/2023]
Abstract
With the rapid development of artificial intelligence (AI) technologies, and the large amount of pharmacovigilance-related data stored in an electronic manner, data-driven automatic methods need to be urgently applied to all aspects of pharmacovigilance to assist healthcare professionals. However, the quantity and quality of data directly affect the performance of AI, and there are particular challenges to implementing AI in limited-resource settings. Analyzing challenges and solutions for AI-based pharmacovigilance in resource-limited settings can improve pharmacovigilance frameworks and capabilities in these settings. In this review, we summarize the challenges into four categories: establishing a database for an AI-based pharmacovigilance system, lack of human resources, weak AI technology and insufficient government support. This study also discusses possible solutions and future perspectives on AI-based pharmacovigilance in resource-limited settings.
Affiliation(s)
- Likeng Liang
- School of Computer Science, South China Normal University, Guangzhou, China
- Jifa Hu
- The Central Hospital of Wuhan, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Gang Sun
- Key Laboratory of Oncology of Xinjiang Uyghur Autonomous Region, The Affiliated Cancer Hospital of Xinjiang Medical University, Ürümqi, China
- Na Hong
- Digital Health China Technologies Co., Ltd., Beijing, China
- Ge Wu
- Digital Health China Technologies Co., Ltd., Beijing, China
- Yuejun He
- Digital Health China Technologies Co., Ltd., Beijing, China
- Yong Li
- School of Computer Science, South China Normal University, Guangzhou, China
- Tianyong Hao
- School of Computer Science, South China Normal University, Guangzhou, China
- Li Liu
- Institute of Health Management, Southern Medical University, Guangzhou, China
- Mengchun Gong
- Institute of Health Management, Southern Medical University, Guangzhou, China
16
Grissette H, Nfaoui EH. Affective Concept-Based Encoding of Patient Narratives via Sentic Computing and Neural Networks. Cognit Comput 2021; 14:274-299. [PMID: 34422122 PMCID: PMC8371039 DOI: 10.1007/s12559-021-09903-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2020] [Accepted: 06/23/2021] [Indexed: 11/30/2022]
Abstract
The automatic generation of features without human intervention is the most critical task for biomedical sentiment analysis. Regarding the high dynamicity of shared patient narrative data, the lack of formal medical language sentiment dictionaries prevents retrieval of the appropriate sentiment, which is unapproachable and can be prone to annotator bias. We propose a novel affective biomedical concept-based encoding via sentic computing and neural networks. The main contributions include four aspects. First, a biomedical embedding, in which a medical entity is defined, normalized, and synthesized from a text, is built using online patient narratives after being combined with label propagation from a widely used comprehensive biomedical vocabulary. Second, considering the dependence on biomedical definitions, drug reaction sample selection based on general matching is suggested. These feature settings are then used to build and recognize affective semantics and sentics based on an extreme learning machine. Finally, a semisupervised LSTM-BiLSTM model for biomedical sentiment analysis is constructed. There was a massive influx of patient self-reports related to the COVID-19 pandemic. A study was conducted in this direction, and we tested the validity, medical language familiarity, and transferability of our approach by analyzing millions of COVID-19 tweets. Comparisons to affective lexicons also indicate that integrating extreme learning machine cognitive capabilities has advantages over biomedical sentiment analysis. By considering sentics vectors on top of the formed embeddings, our semisupervised LSTM-BiLSTM achieved an accuracy of 87.5%. The evaluations of unsupervised learning approximated the results of the previous model when dealing with a serious loss of biomedical data. 
In this paper, we demonstrate the effectiveness of integrating deep-learning-based cognitive capabilities for both enhancing distributed biomedical definitions and inferring sentiment compositions from many patient self-reports on social networks. The relevant encoding of affective information conveyed regarding medication subjects clearly reveals defined roles and expectations that can have a positive impact on public health.
Affiliation(s)
- Hanane Grissette
- LISAC Laboratory, Faculty of Sciences Dhar EL Mahraz, Sidi Mohamed Ben Abdellah University, Fez, Morocco
- El Habib Nfaoui
- LISAC Laboratory, Faculty of Sciences Dhar EL Mahraz, Sidi Mohamed Ben Abdellah University, Fez, Morocco
17
Gattepaille LM, Hedfors Vidlin S, Bergvall T, Pierce CE, Ellenius J. Prospective Evaluation of Adverse Event Recognition Systems in Twitter: Results from the Web-RADR Project. Drug Saf 2021; 43:797-808. [PMID: 32410156 PMCID: PMC7395913 DOI: 10.1007/s40264-020-00942-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 02/06/2023]
Abstract
Introduction A large number of studies on systems to detect and sometimes normalize adverse events (AEs) in social media have been published, but evidence of their practical utility is scarce. This raises the question of the transferability of such systems to new settings. Objectives The aims of this study were to develop an AE recognition system, prospectively evaluate its performance on an external benchmark dataset and identify potential factors influencing the transferability of AE recognition systems. Methods A pipeline based on dictionary lookups and logistic regression classifiers was developed using a proprietary dataset of 196,533 Tweets manually annotated for AE relations, and the system was prospectively evaluated on the publicly available WEB-RADR reference dataset, exploring different aspects affecting transferability. Results Our system achieved 0.53 precision, 0.52 recall and 0.52 F1-score on the development test set; however, when applied to the WEB-RADR reference dataset, system performance dropped to 0.38 precision, 0.20 recall and 0.26 F1-score. Similarly, a previously published method aiming at automatically detecting adverse event posts reported 0.5 precision, 0.92 recall and 0.65 F1-score on another dataset, while performance on the WEB-RADR reference dataset was reduced to 0.37 precision, 0.63 recall and 0.46 F1-score. We identified four potential factors leading to poor transferability: overfitting, selection bias, label bias and prevalence. Conclusion We warn the community about a potentially large discrepancy between the expected performance of automated AE recognition systems based on published results and the actual observed performance on independent data. This study highlights the difficulty of implementing an all-purpose system for automatic adverse event recognition in Twitter, which could explain the lack of such systems in practical pharmacovigilance settings. 
Our recommendation is to use benchmark independent datasets, such as the WEB-RADR reference, to investigate the transferability of the adverse event recognition systems and ultimately enforce rigorous comparisons across studies on the task. Electronic supplementary material The online version of this article (10.1007/s40264-020-00942-3) contains supplementary material, which is available to authorized users.
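The F1-scores quoted in this abstract can be related to the quoted precision and recall through the usual harmonic-mean definition. The sketch below recomputes F1 from the rounded figures in the abstract; the dictionary keys are descriptive labels chosen here, not names from the paper, and because the inputs are already rounded to two decimals the recomputed values can differ slightly from the reported ones.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as quoted in the abstract.
reported = {
    "own system, development test set": (0.53, 0.52, 0.52),
    "own system, WEB-RADR reference": (0.38, 0.20, 0.26),
    "published method, original dataset": (0.50, 0.92, 0.65),
    "published method, WEB-RADR reference": (0.37, 0.63, 0.46),
}

for name, (p, r, f1) in reported.items():
    print(f"{name}: recomputed F1 = {f1_score(p, r):.3f}, reported = {f1:.2f}")
```

Each recomputed value agrees with the reported F1 to within 0.01, which is the precision one can expect when starting from two-decimal inputs.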
Affiliation(s)
- Tomas Bergvall
- Uppsala Monitoring Centre, Box 1051, 75140, Uppsala, Sweden
- Johan Ellenius
- Uppsala Monitoring Centre, Box 1051, 75140, Uppsala, Sweden
18
Wu J, Sivaraman V, Kumar D, Banda JM, Sontag D. Pulse of the pandemic: Iterative topic filtering for clinical information extraction from social media. J Biomed Inform 2021; 120:103844. [PMID: 34153432 PMCID: PMC9339268 DOI: 10.1016/j.jbi.2021.103844] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/16/2020] [Revised: 06/02/2021] [Accepted: 06/15/2021] [Indexed: 12/31/2022]
Abstract
The rapid evolution of the COVID-19 pandemic has underscored the need to quickly disseminate the latest clinical knowledge during a public-health emergency. One surprisingly effective platform for healthcare professionals (HCPs) to share knowledge and experiences from the front lines has been social media (for example, the "#medtwitter" community on Twitter). However, identifying clinically-relevant content in social media without manual labeling is a challenge because of the sheer volume of irrelevant data. We present an unsupervised, iterative approach to mine clinically relevant information from social media data, which begins by heuristically filtering for HCP-authored texts and incorporates topic modeling and concept extraction with MetaMap. This approach identifies granular topics and tweets with high clinical relevance from a set of about 52 million COVID-19-related tweets from January to mid-June 2020. We also show that because the technique does not require manual labeling, it can be used to identify emerging topics on a week-to-week basis. Our method can aid in future public-health emergencies by facilitating knowledge transfer among healthcare workers in a rapidly-changing information environment, and by providing an efficient and unsupervised way of highlighting potential areas for clinical research.
Affiliation(s)
- Julia Wu
- Dept of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge 02139, MA, USA
- Venkatesh Sivaraman
- Human-Computer Interaction Institute, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh 15213, PA, USA
- Dheekshita Kumar
- Dept of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge 02139, MA, USA
- Juan M Banda
- Dept of Computer Science, Georgia State University, 25 Park Place, Atlanta 30303, GA, USA
- David Sontag
- Dept of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge 02139, MA, USA
19
Magge A, Tutubalina E, Miftahutdinov Z, Alimova I, Dirkson A, Verberne S, Weissenbacher D, Gonzalez-Hernandez G. DeepADEMiner: a deep learning pharmacovigilance pipeline for extraction and normalization of adverse drug event mentions on Twitter. J Am Med Inform Assoc 2021; 28:2184-2192. [PMID: 34270701 PMCID: PMC8449608 DOI: 10.1093/jamia/ocab114] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 01/20/2021] [Revised: 05/20/2021] [Accepted: 06/08/2021] [Indexed: 11/17/2022]
Abstract
Objective Research on pharmacovigilance from social media data has focused on mining adverse drug events (ADEs) using annotated datasets, with publications generally focusing on 1 of 3 tasks: ADE classification, named entity recognition for identifying the span of ADE mentions, and ADE mention normalization to standardized terminologies. While the common goal of such systems is to detect ADE signals that can be used to inform public policy, it has been impeded largely by limited end-to-end solutions for large-scale analysis of social media reports for different drugs. Materials and Methods We present a dataset for training and evaluation of ADE pipelines where the ADE distribution is closer to the average ‘natural balance’ with ADEs present in about 7% of the tweets. The deep learning architecture involves an ADE extraction pipeline with individual components for all 3 tasks. Results The system presented achieved state-of-the-art performance on comparable datasets and scored a classification performance of F1 = 0.63, span extraction performance of F1 = 0.44 and an end-to-end entity resolution performance of F1 = 0.34 on the presented dataset. Discussion The performance of the models continues to highlight multiple challenges when deploying pharmacovigilance systems that use social media data. We discuss the implications of such models in the downstream tasks of signal detection and suggest future enhancements. Conclusion Mining ADEs from Twitter posts using a pipeline architecture requires the different components to be trained and tuned based on input data imbalance in order to ensure optimal performance on the end-to-end resolution task.
Affiliation(s)
- Arjun Magge
- DBEI, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Davy Weissenbacher
- DBEI, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
20
Dietrich J, Gattepaille LM, Grum BA, Jiri L, Lerch M, Sartori D, Wisniewski A. Adverse Events in Twitter-Development of a Benchmark Reference Dataset: Results from IMI WEB-RADR. Drug Saf 2021; 43:467-478. [PMID: 31997289 PMCID: PMC7165158 DOI: 10.1007/s40264-020-00912-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/24/2022]
Abstract
Introduction and Objective Social media has been suggested as a source for safety information, supplementing existing safety surveillance data sources. This article summarises the activities undertaken, and the associated challenges, to create a benchmark reference dataset that can be used to evaluate the performance of automated methods and systems for adverse event recognition. Methods A retrospective analysis of public English-language Twitter posts (Tweets) was performed. We sampled 57,473 Tweets out of 5,645,336 Tweets created between 1 March, 2012 and 1 March, 2015 that mentioned at least one of six medicinal products of interest (insulin glargine, levetiracetam, methylphenidate, sorafenib, terbinafine, zolpidem). Products, adverse events, indications, product-event combinations, and product-indication combinations were extracted and coded by two independent teams of safety reviewers. Results The benchmark reference dataset consisted of 1056 positive controls (“adverse event Tweets”) and 56,417 negative controls (“non-adverse event Tweets”). The 1056 adverse event Tweets contained 1396 product-event combinations referring to personal adverse event experiences, comprising 292 different MedDRA® Preferred Terms. Of these, 1171 product-event combinations (83.9%) were confined to four MedDRA® System Organ Classes. Of the adverse event Tweets, 195 (18.5%) contained indication information, comprising 25 different Preferred Terms. Conclusions A manually curated benchmark reference dataset based on Twitter data has been created and is made available to the research community to evaluate the performance of automated methods and systems for adverse event recognition in unstructured free-text information. Electronic supplementary material The online version of this article (10.1007/s40264-020-00912-9) contains supplementary material, which is available to authorized users.
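The counts and percentages in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only figures quoted in the abstract; the variable names are chosen here for readability, and the denominators (1396 product-event combinations and 1056 adverse event Tweets) are assumptions, picked because they reproduce the quoted percentages. The prevalence figure is derived here and does not appear in the abstract.

```python
# Figures quoted in the abstract.
sampled_tweets = 57_473
positive_controls = 1_056            # "adverse event Tweets"
negative_controls = 56_417           # "non-adverse event Tweets"
product_event_combinations = 1_396
combinations_in_four_socs = 1_171    # confined to four System Organ Classes
tweets_with_indication = 195

# Positive and negative controls should add up to the sampled Tweets.
controls_total = positive_controls + negative_controls

# Assumed denominators: all product-event combinations, all adverse event Tweets.
share_in_four_socs = combinations_in_four_socs / product_event_combinations
share_with_indication = tweets_with_indication / positive_controls

# Derived value, not stated in the abstract: share of adverse event Tweets.
prevalence = positive_controls / sampled_tweets

print(f"controls total: {controls_total}")
print(f"confined to four SOCs: {share_in_four_socs:.1%}")
print(f"with indication information: {share_with_indication:.1%}")
print(f"adverse event prevalence in the sample: {prevalence:.1%}")
```

The low prevalence of adverse event Tweets in the sample is worth keeping in mind when reading precision and recall figures reported on this benchmark.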
Affiliation(s)
- Juergen Dietrich
- Pharmacovigilance, Bayer AG, Müllerstr. 170, 13353, Berlin, Germany
- Britta Anne Grum
- Pharmacovigilance, Bayer AG, Müllerstr. 170, 13353, Berlin, Germany
- Letitia Jiri
- Global Patient Safety Pharmacovigilance Operations, Amgen Limited, Cambridge, UK
- Antoni Wisniewski
- Global Regulatory Affairs, Patient Safety and Quality Assurance, Global Medicines Development, AstraZeneca, Cambridge, UK
21
Tutubalina E, Alimova I, Miftahutdinov Z, Sakhovskiy A, Malykh V, Nikolenko S. The Russian Drug Reaction Corpus and neural models for drug reactions and effectiveness detection in user reviews. Bioinformatics 2021; 37:243-249. [PMID: 32722774 DOI: 10.1093/bioinformatics/btaa675] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/08/2020] [Revised: 07/14/2020] [Accepted: 07/20/2020] [Indexed: 11/14/2022]
Abstract
MOTIVATION Drugs and diseases play a central role in many areas of biomedical research and healthcare. Aggregating knowledge about these entities across a broader range of domains and languages is critical for information extraction (IE) applications. To facilitate text mining methods for analysis and comparison of patients' health conditions and adverse drug reactions reported on the Internet with traditional sources such as drug labels, we present a new corpus of Russian language health reviews. RESULTS The Russian Drug Reaction Corpus (RuDReC) is a new partially annotated corpus of consumer reviews in Russian about pharmaceutical products for the detection of health-related named entities and the effectiveness of pharmaceutical products. The corpus itself consists of two parts, the raw one and the labeled one. The raw part includes 1.4 million health-related user-generated texts collected from various Internet sources, including social media. The labeled part contains 500 consumer reviews about drug therapy with drug- and disease-related information. Labels for sentences include health-related issues or their absence. Sentences that contain a health-related issue are additionally labeled at the expression level for identification of fine-grained subtypes such as drug classes and drug forms, drug indications and drug reactions. Further, we present a baseline model for named entity recognition (NER) and multilabel sentence classification tasks on this corpus. The macro F1 score of 74.85% in the NER task was achieved by our RuDR-BERT model. For the sentence classification task, our model achieves the macro F1 score of 68.82%, gaining 7.47% over the score of the BERT model trained on Russian data. AVAILABILITY AND IMPLEMENTATION We make the RuDReC corpus and pretrained weights of domain-specific BERT models freely available at https://github.com/cimm-kzn/RuDReC. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Elena Tutubalina
- Chemoinformatics and Molecular Modeling Laboratory, The Alexander Butlerov Institute of Chemistry, Kazan Federal University, Kazan 420008, Russian Federation
- Ilseyar Alimova
- Chemoinformatics and Molecular Modeling Laboratory, The Alexander Butlerov Institute of Chemistry, Kazan Federal University, Kazan 420008, Russian Federation
- Zulfat Miftahutdinov
- Chemoinformatics and Molecular Modeling Laboratory, The Alexander Butlerov Institute of Chemistry, Kazan Federal University, Kazan 420008, Russian Federation
- Andrey Sakhovskiy
- Chemoinformatics and Molecular Modeling Laboratory, The Alexander Butlerov Institute of Chemistry, Kazan Federal University, Kazan 420008, Russian Federation
- Valentin Malykh
- Chemoinformatics and Molecular Modeling Laboratory, The Alexander Butlerov Institute of Chemistry, Kazan Federal University, Kazan 420008, Russian Federation
- Sergey Nikolenko
- Chemoinformatics and Molecular Modeling Laboratory, The Alexander Butlerov Institute of Chemistry, Kazan Federal University, Kazan 420008, Russian Federation; Samsung-PDMI AI Center, Steklov Institute of Mathematics at St. Petersburg, St. Petersburg 191023, Russian Federation
22
Lwowski B, Rios A. The risk of racial bias while tracking influenza-related content on social media using machine learning. J Am Med Inform Assoc 2021; 28:839-849. [PMID: 33484133 PMCID: PMC7973478 DOI: 10.1093/jamia/ocaa326] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/20/2020] [Accepted: 12/08/2020] [Indexed: 11/13/2022]
Abstract
OBJECTIVE Machine learning is used to understand and track influenza-related content on social media. Because these systems are used at scale, they have the potential to adversely impact the people they are built to help. In this study, we explore the biases of different machine learning methods for the specific task of detecting influenza-related content. We compare the performance of each model on tweets written in Standard American English (SAE) vs African American English (AAE). MATERIALS AND METHODS Two influenza-related datasets are used to train 3 text classification models (support vector machine, convolutional neural network, bidirectional long short-term memory) with different feature sets. The datasets match real-world scenarios in which there is a large imbalance between SAE and AAE examples. The number of AAE examples for each class ranges from 2% to 5% in both datasets. We also evaluate each model's performance using a balanced dataset via undersampling. RESULTS We find that all of the tested machine learning methods are biased on both datasets. The difference in false positive rates between SAE and AAE examples ranges from 0.01 to 0.35. The difference in the false negative rates ranges from 0.01 to 0.23. We also find that the neural network methods generally have more unfair results than the linear support vector machine on the chosen datasets. CONCLUSIONS The models that result in the most unfair predictions may vary from dataset to dataset. Practitioners should be aware of the potential harms related to applying machine learning to health-related social media data. At a minimum, we recommend evaluating fairness along with traditional evaluation metrics.
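The fairness measure reported in this abstract, the difference in false positive and false negative rates between SAE and AAE examples, can be sketched as follows. This is an illustration under stated assumptions, not the authors' evaluation code: the function names are hypothetical, the gap is taken as an absolute difference, and the toy labels and predictions are invented so that the arithmetic is easy to follow.

```python
from collections import Counter


def error_rates(y_true, y_pred):
    """Return (false positive rate, false negative rate) for binary labels."""
    counts = Counter(zip(y_true, y_pred))
    fp, tn = counts[(0, 1)], counts[(0, 0)]
    fn, tp = counts[(1, 0)], counts[(1, 1)]
    fpr = fp / (fp + tn) if fp + tn else 0.0
    fnr = fn / (fn + tp) if fn + tp else 0.0
    return fpr, fnr


def dialect_gaps(y_true, y_pred, groups, group_a="SAE", group_b="AAE"):
    """Absolute FPR and FNR differences between two dialect groups."""
    rates = {}
    for group in (group_a, group_b):
        idx = [i for i, g in enumerate(groups) if g == group]
        rates[group] = error_rates([y_true[i] for i in idx],
                                   [y_pred[i] for i in idx])
    fpr_gap = abs(rates[group_a][0] - rates[group_b][0])
    fnr_gap = abs(rates[group_a][1] - rates[group_b][1])
    return fpr_gap, fnr_gap


# Invented toy data: six SAE tweets followed by four AAE tweets.
groups = ["SAE"] * 6 + ["AAE"] * 4
y_true = [1, 1, 0, 0, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 0, 0, 1, 0]

fpr_gap, fnr_gap = dialect_gaps(y_true, y_pred, groups)
print(f"FPR gap = {fpr_gap:.2f}, FNR gap = {fnr_gap:.2f}")
```

Reporting these gaps next to accuracy or F1 is one concrete way to follow the abstract's recommendation to evaluate fairness along with traditional metrics.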
Affiliation(s)
- Brandon Lwowski
- Department of Information Systems and Cyber Security, University of Texas at San Antonio, San Antonio, Texas, USA
- Anthony Rios
- Department of Information Systems and Cyber Security, University of Texas at San Antonio, San Antonio, Texas, USA
23
Sarker A, DeRoos A, Perrone J. Mining social media for prescription medication abuse monitoring: a review and proposal for a data-centric framework. J Am Med Inform Assoc 2021; 27:315-329. [PMID: 31584645 PMCID: PMC7025330 DOI: 10.1093/jamia/ocz162] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 07/09/2019] [Revised: 08/14/2019] [Indexed: 01/02/2023]
Abstract
Objective Prescription medication (PM) misuse and abuse is a major health problem globally, and a number of recent studies have focused on exploring social media as a resource for monitoring nonmedical PM use. Our objectives are to present a methodological review of social media–based PM abuse or misuse monitoring studies, and to propose a potential generalizable, data-centric processing pipeline for the curation of data from this resource. Materials and Methods We identified studies involving social media, PMs, and misuse or abuse (inclusion criteria) from Medline, Embase, Scopus, Web of Science, and Google Scholar. We categorized studies based on multiple characteristics including but not limited to data size; social media source(s); medications studied; and primary objectives, methods, and findings. Results A total of 39 studies met our inclusion criteria, with 31 (∼79.5%) published since 2015. Twitter has been the most popular resource, with Reddit and Instagram gaining popularity recently. Early studies focused mostly on manual, qualitative analyses, with a growing trend toward the use of data-centric methods involving natural language processing and machine learning. Discussion There is a paucity of standardized, data-centric frameworks for curating social media data for task-specific analyses and near real-time surveillance of nonmedical PM use. Many existing studies do not quantify human agreements for manual annotation tasks or take into account the presence of noise in data. Conclusion The development of reproducible and standardized data-centric frameworks that build on the current state-of-the-art methods in data and text mining may enable effective utilization of social media data for understanding and monitoring nonmedical PM use.
Affiliation(s)
- Abeed Sarker
- Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, Georgia, USA
- Annika DeRoos
- College of Arts and Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Jeanmarie Perrone
- Department of Emergency Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
24
Digan W, Névéol A, Neuraz A, Wack M, Baudoin D, Burgun A, Rance B. Can reproducibility be improved in clinical natural language processing? A study of 7 clinical NLP suites. J Am Med Inform Assoc 2021; 28:504-515. [PMID: 33319904 PMCID: PMC7936396 DOI: 10.1093/jamia/ocaa261] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 07/07/2020] [Indexed: 11/24/2022]
Abstract
Background The increasing complexity of data streams and computational processes in modern clinical health information systems makes reproducibility challenging. Clinical natural language processing (NLP) pipelines are routinely leveraged for the secondary use of data. Workflow management systems (WMS) have been widely used in bioinformatics to handle the reproducibility bottleneck. Objective To evaluate if WMS and other bioinformatics practices could impact the reproducibility of clinical NLP frameworks. Materials and Methods Based on the literature across multiple research fields (NLP, bioinformatics and clinical informatics) we selected articles which (1) review reproducibility practices and (2) highlight a set of rules or guidelines to ensure tool or pipeline reproducibility. We aggregate insight from the literature to define reproducibility recommendations. Finally, we assess the compliance of 7 NLP frameworks with the recommendations. Results We identified 40 reproducibility features from 8 selected articles. Frameworks based on WMS match more than 50% of features (26 features for LAPPS Grid, 22 features for OpenMinted) compared to 18 features for current clinical NLP frameworks (cTakes, CLAMP) and 17 features for GATE, ScispaCy, and Textflows. Discussion 34 recommendations are endorsed by at least 2 articles from our selection. Overall, 15 features were adopted by every NLP framework. Nevertheless, frameworks based on WMS had a better compliance with the features. Conclusion NLP frameworks could benefit from lessons learned from the bioinformatics field (eg, public repositories of curated tools and workflows or use of containers for shareability) to enhance the reproducibility in a clinical setting.
Affiliation(s)
- William Digan
- INSERM, Centre de Recherche des Cordeliers, UMRS 1138, Université de Paris, Université Sorbonne Paris Cité, Paris, France; Department of Medical Informatics, Hôpital Européen Georges Pompidou, Assistance publique-Hôpitaux de Paris, Paris, France
- Antoine Neuraz
- INSERM, Centre de Recherche des Cordeliers, UMRS 1138, Université de Paris, Université Sorbonne Paris Cité, Paris, France; Department of Medical Informatics, Necker Children's Hospital, Assistance publique-Hôpitaux de Paris, Paris, France
- Maxime Wack
- INSERM, Centre de Recherche des Cordeliers, UMRS 1138, Université de Paris, Université Sorbonne Paris Cité, Paris, France; Department of Medical Informatics, Hôpital Européen Georges Pompidou, Assistance publique-Hôpitaux de Paris, Paris, France
- David Baudoin
- INSERM, Centre de Recherche des Cordeliers, UMRS 1138, Université de Paris, Université Sorbonne Paris Cité, Paris, France; Department of Medical Informatics, Hôpital Européen Georges Pompidou, Assistance publique-Hôpitaux de Paris, Paris, France
- Anita Burgun
- INSERM, Centre de Recherche des Cordeliers, UMRS 1138, Université de Paris, Université Sorbonne Paris Cité, Paris, France; Department of Medical Informatics, Hôpital Européen Georges Pompidou, Assistance publique-Hôpitaux de Paris, Paris, France; Department of Medical Informatics, Necker Children's Hospital, Assistance publique-Hôpitaux de Paris, Paris, France
- Bastien Rance
- INSERM, Centre de Recherche des Cordeliers, UMRS 1138, Université de Paris, Université Sorbonne Paris Cité, Paris, France; Department of Medical Informatics, Hôpital Européen Georges Pompidou, Assistance publique-Hôpitaux de Paris, Paris, France
|
25
|
Wu J, Sivaraman V, Kumar D, Banda JM, Sontag D. Pulse of the Pandemic: Iterative Topic Filtering for Clinical Information Extraction from Social Media. ARXIV 2021:arXiv:2102.06836v2. [PMID: 33594339 PMCID: PMC7885911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Revised: 06/28/2021] [Indexed: 06/12/2023]
Abstract
The rapid evolution of the COVID-19 pandemic has underscored the need to quickly disseminate the latest clinical knowledge during a public-health emergency. One surprisingly effective platform for healthcare professionals (HCPs) to share knowledge and experiences from the front lines has been social media (for example, the "#medtwitter" community on Twitter). However, identifying clinically-relevant content in social media without manual labeling is a challenge because of the sheer volume of irrelevant data. We present an unsupervised, iterative approach to mine clinically relevant information from social media data, which begins by heuristically filtering for HCP-authored texts and incorporates topic modeling and concept extraction with MetaMap. This approach identifies granular topics and tweets with high clinical relevance from a set of about 52 million COVID-19-related tweets from January to mid-June 2020. We also show that because the technique does not require manual labeling, it can be used to identify emerging topics on a week-to-week basis. Our method can aid in future public-health emergencies by facilitating knowledge transfer among healthcare workers in a rapidly-changing information environment, and by providing an efficient and unsupervised way of highlighting potential areas for clinical research.
Affiliation(s)
- Julia Wu
- Dept of Electrical Engineering and Computer Science, Massachusetts Institute of Technology
- Dheekshita Kumar
- Dept of Electrical Engineering and Computer Science, Massachusetts Institute of Technology
- Juan M Banda
- Department of Computer Science, Georgia State University
- David Sontag
- Dept of Electrical Engineering and Computer Science, Massachusetts Institute of Technology
|
26
|
Bulcock A, Hassan L, Giles S, Sanders C, Nenadic G, Campbell S, Dixon W. Public Perspectives of Using Social Media Data to Improve Adverse Drug Reaction Reporting: A Mixed-Methods Study. Drug Saf 2021; 44:553-564. [PMID: 33582973 PMCID: PMC8053157 DOI: 10.1007/s40264-021-01042-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 01/11/2021] [Indexed: 11/30/2022]
Abstract
Introduction Information on suspected adverse drug reactions (ADRs) voluntarily submitted by patients can be a valuable source of information for improving drug safety; however, public awareness of reporting mechanisms remains low. Whilst methods to automatically detect ADR mentions from social media posts using text mining techniques have been proposed to improve reporting rates, it is unclear how acceptable these would be to social media users. Objective The objective of this study was to explore public opinion about using automated methods to detect and report mentions of ADRs on social media to enhance pharmacovigilance efforts. Methods Users of the online health discussion forum HealthUnlocked participated in an online survey (N = 1359) about experiences with ADRs, knowledge of pharmacovigilance methods, and opinions about using automated data mining methods to detect and report ADRs. To further explore responses, five qualitative focus groups were conducted with 20 social media users with long-term health conditions. Results Participant responses indicated a low awareness of pharmacovigilance methods and ADR reporting. They showed a strong willingness to share health-related social media data about ADRs with researchers and regulators, but were cautious about automated text mining methods of detecting and reporting ADRs. Conclusions Social media users value public-facing pharmacovigilance schemes, even if they do not understand the current framework of pharmacovigilance within the UK. Ongoing engagement with users is essential to understand views, share knowledge and respect users’ privacy expectations to optimise future ADR reporting from online health communities. Supplementary Information The online version contains supplementary material available at 10.1007/s40264-021-01042-6.
Affiliation(s)
- Alexander Bulcock
- Health Education England, North West Deanery, UK
- Division of Musculoskeletal and Dermatological Sciences, Centre for Epidemiology Versus Arthritis, The University of Manchester, Oxford Road, Manchester, M13 9PL, UK
- Lamiece Hassan
- Division of Informatics, Imaging and Data Sciences, Centre for Health Informatics, The University of Manchester, Manchester, UK
- Sally Giles
- Division of Population Health, Health Services Research and Primary Care, NIHR Greater Manchester Primary Care Patient Safety Translational Research Centre, The University of Manchester, Manchester, UK
- Caroline Sanders
- Division of Population Health, Health Services Research and Primary Care, NIHR Greater Manchester Primary Care Patient Safety Translational Research Centre, The University of Manchester, Manchester, UK
- Goran Nenadic
- School of Computer Science, The University of Manchester, Manchester, UK
- Stephen Campbell
- Division of Population Health, Health Services Research and Primary Care, NIHR Greater Manchester Primary Care Patient Safety Translational Research Centre, The University of Manchester, Manchester, UK
- Will Dixon
- Division of Musculoskeletal and Dermatological Sciences, Centre for Epidemiology Versus Arthritis, The University of Manchester, Oxford Road, Manchester, M13 9PL, UK
|
27
|
Weichselbraun A, Steixner J, Braşoveanu AMP, Scharl A, Göbel M, Nixon LJB. Automatic Expansion of Domain-Specific Affective Models for Web Intelligence Applications. Cognit Comput 2021; 14:228-245. [PMID: 33552304 PMCID: PMC7846919 DOI: 10.1007/s12559-021-09839-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/27/2020] [Accepted: 01/12/2021] [Indexed: 11/29/2022]
Abstract
Sentic computing relies on well-defined affective models of different complexity—polarity to distinguish positive and negative sentiment, for example, or more nuanced models to capture expressions of human emotions. When used to measure communication success, even the most granular affective model combined with sophisticated machine learning approaches may not fully capture an organisation’s strategic positioning goals. Such goals often deviate from the assumptions of standardised affective models. While certain emotions such as Joy and Trust typically represent desirable brand associations, specific communication goals formulated by marketing professionals often go beyond such standard dimensions. For instance, the brand manager of a television show may consider fear or sadness to be desired emotions for its audience. This article introduces expansion techniques for affective models, combining common and commonsense knowledge available in knowledge graphs with language models and affective reasoning, improving coverage and consistency as well as supporting domain-specific interpretations of emotions. An extensive evaluation compares the performance of different expansion techniques: (i) a quantitative evaluation based on the revisited Hourglass of Emotions model to assess performance on complex models that cover multiple affective categories, using manually compiled gold standard data, and (ii) a qualitative evaluation of a domain-specific affective model for television programme brands. The results of these evaluations demonstrate that the introduced techniques support a variety of embeddings and pre-trained models. The paper concludes with a discussion on applying this approach to other scenarios where affective model resources are scarce.
Affiliation(s)
- Albert Weichselbraun
- University of Applied Sciences of the Grisons, Chur, Switzerland; webLyzard technology, Vienna, Austria
- Arno Scharl
- MODUL University Vienna, Vienna, Austria; webLyzard technology, Vienna, Austria
- Max Göbel
- webLyzard technology, Vienna, Austria
- Lyndon J B Nixon
- MODUL Technology, Vienna, Austria; MODUL University Vienna, Vienna, Austria
|
28
|
Al-Garadi MA, Yang YC, Cai H, Ruan Y, O'Connor K, Graciela GH, Perrone J, Sarker A. Text classification models for the automatic detection of nonmedical prescription medication use from social media. BMC Med Inform Decis Mak 2021; 21:27. [PMID: 33499852 PMCID: PMC7835447 DOI: 10.1186/s12911-021-01394-0] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 08/11/2020] [Accepted: 01/12/2021] [Indexed: 01/27/2023]
Abstract
BACKGROUND Prescription medication (PM) misuse/abuse has emerged as a national crisis in the United States, and social media has been suggested as a potential resource for performing active monitoring. However, automating a social media-based monitoring system is challenging, requiring advanced natural language processing (NLP) and machine learning methods. In this paper, we describe the development and evaluation of automatic text classification models for detecting self-reports of PM abuse from Twitter. METHODS We experimented with state-of-the-art bi-directional transformer-based language models, which utilize tweet-level representations that enable transfer learning (e.g., BERT, RoBERTa, XLNet, AlBERT, and DistilBERT), proposed fusion-based approaches, and compared the developed models with several traditional machine learning and deep learning approaches. Using a public dataset, we evaluated the performances of the classifiers on their abilities to classify the non-majority "abuse/misuse" class. RESULTS Our proposed fusion-based model performs significantly better than the best traditional model (F1-score [95% CI]: 0.67 [0.64-0.69] vs. 0.45 [0.42-0.48]). We illustrate, via experimentation using varying training set sizes, that the transformer-based models are more stable and require less annotated data compared to the other models. The significant improvements achieved by our best-performing classification model over past approaches make it suitable for automated continuous monitoring of nonmedical PM use from Twitter. CONCLUSIONS BERT, BERT-like and fusion-based models outperform traditional machine learning and deep learning models, achieving substantial improvements over many years of past research on the topic of prescription medication misuse/abuse classification from social media, which had been shown to be a complex task due to the unique ways in which information about nonmedical use is presented. Several challenges associated with the lack of context and the nature of social media language need to be overcome to further improve BERT and BERT-like models. These experimentally driven challenges are presented as potential future research directions.
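The abstract above reports F1-scores for the non-majority "abuse/misuse" class. As a minimal sketch of how a per-class F1-score of that kind is computed, the following uses made-up labels; the function name `f1_for_class` and the toy data are illustrative assumptions and do not come from the paper.

```python
def f1_for_class(gold, pred, positive):
    """Return the F1-score for a single class, treating it as the positive label."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (hypothetical), with "abuse" as the minority class of interest.
gold = ["abuse", "other", "abuse", "other", "other", "abuse"]
pred = ["abuse", "other", "other", "abuse", "other", "abuse"]
score = f1_for_class(gold, pred, "abuse")
print(round(score, 3))
```

Scoring only the minority class keeps a classifier from looking good simply by predicting the majority label everywhere.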
Affiliation(s)
- Mohammed Ali Al-Garadi
- Department of Biomedical Informatics, School of Medicine, Emory University, 101 Woodruff Circle, Atlanta, GA, 30322, USA
- Yuan-Chi Yang
- Department of Biomedical Informatics, School of Medicine, Emory University, 101 Woodruff Circle, Atlanta, GA, 30322, USA
- Haitao Cai
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Yucheng Ruan
- School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Karen O'Connor
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Gonzalez-Hernandez Graciela
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Jeanmarie Perrone
- Department of Emergency Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Abeed Sarker
- Department of Biomedical Informatics, School of Medicine, Emory University, 101 Woodruff Circle, Atlanta, GA, 30322, USA
- Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, 30322, USA
|
29
|
Datta S, Godfrey-Stovall J, Roberts K. RadLex Normalization in Radiology Reports. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2021; 2020:338-347. [PMID: 33936406 PMCID: PMC8075450] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Radiology reports have been widely used for extraction of various clinically significant information about patients' imaging studies. However, limited research has focused on standardizing the entities to a common radiology-specific vocabulary. Further, no study to date has attempted to leverage RadLex for standardization. In this paper, we aim to normalize a diverse set of radiological entities to RadLex terms. We manually construct a normalization corpus by annotating entities from three types of reports. This contains 1706 entity mentions. We propose two deep learning-based NLP methods based on a pre-trained language model (BERT) for automatic normalization. First, we employ BM25 to retrieve candidate concepts for the BERT-based models (re-ranker and span detector) to predict the normalized concept. The results are promising, with the best accuracy (78.44%) obtained by the span detector. Additionally, we discuss the challenges involved in corpus construction and propose new RadLex terms.
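The abstract above describes using BM25 to retrieve candidate concepts before a BERT-based model picks the normalized term. The retrieval step alone can be sketched as follows, assuming whitespace tokenization and commonly used default parameters (`k1=1.5`, `b=0.75`); the function name `bm25_rank` and the concept strings are illustrative placeholders, not actual RadLex entries or the authors' implementation.

```python
import math
from collections import Counter


def bm25_rank(query, documents, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25.

    Returns (ranking, scores), where ranking lists document indices
    from best to worst match.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            df = doc_freq[term]
            # Smoothed IDF variant that stays positive for frequent terms.
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)

    ranking = sorted(range(n_docs), key=lambda i: scores[i], reverse=True)
    return ranking, scores


# Hypothetical concept names standing in for vocabulary terms.
concepts = ["pleural effusion", "pulmonary nodule", "rib fracture", "pericardial effusion"]
ranking, scores = bm25_rank("small pleural effusion", concepts)
top_candidates = [concepts[i] for i in ranking[:2]]
print(top_candidates)
```

In a retrieve-then-rerank setup like the one described, only the few top-ranked candidates would be passed to the more expensive neural model.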
Affiliation(s)
- Surabhi Datta
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX
- Jordan Godfrey-Stovall
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX
- Kirk Roberts
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX
|
30
|
Bayer S, Clark C, Dang O, Aberdeen J, Brajovic S, Swank K, Hirschman L, Ball R. ADE Eval: An Evaluation of Text Processing Systems for Adverse Event Extraction from Drug Labels for Pharmacovigilance. Drug Saf 2021; 44:83-94. [PMID: 33006728 PMCID: PMC7813736 DOI: 10.1007/s40264-020-00996-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Accepted: 09/02/2020] [Indexed: 12/05/2022]
Abstract
INTRODUCTION The US FDA is interested in a tool that would enable pharmacovigilance safety evaluators to automate the identification of adverse drug events (ADEs) mentioned in FDA prescribing information. The MITRE Corporation (MITRE) and the FDA organized a shared task, Adverse Drug Event Evaluation (ADE Eval), to determine whether the performance of algorithms currently used for natural language processing (NLP) might be good enough for real-world use. OBJECTIVE ADE Eval was conducted to evaluate a range of NLP techniques for identifying ADEs mentioned in publicly available FDA-approved drug labels (package inserts). It was designed specifically to reflect pharmacovigilance practices within the FDA and model possible pharmacovigilance use cases. METHODS Pharmacovigilance-specific annotation guidelines and annotated corpora were created. Two metrics modeled the experiences of FDA safety evaluators: one measured the ability of an algorithm to identify correct Medical Dictionary for Regulatory Activities (MedDRA®) terms for the text from the annotated corpora, and the other assessed the quality of evidence extracted from the corpora to support the selected MedDRA® term by measuring the portion of annotated text an algorithm correctly identified. A third metric assessed the cost of correcting system output for subsequent training (averaged, weighted F1-measure for mention finding). RESULTS In total, 13 teams submitted 23 runs: the top MedDRA® coding F1-measure was 0.79, the top quality score was 0.96, and the top mention-finding F1-measure was 0.89. CONCLUSION While NLP techniques do not perform at levels that would allow them to be used without intervention, it is now worthwhile exploring making NLP outputs available in human pharmacovigilance workflows.
Affiliation(s)
- Samuel Bayer
- The MITRE Corporation, 202 Burlington Rd, Bedford, MA 01730 USA
- Cheryl Clark
- The MITRE Corporation, 202 Burlington Rd, Bedford, MA 01730 USA
- Oanh Dang
- Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD USA
- John Aberdeen
- The MITRE Corporation, 202 Burlington Rd, Bedford, MA 01730 USA
- Sonja Brajovic
- Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD USA
- Kimberley Swank
- Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD USA
- Robert Ball
- Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD USA
|
31
|
C-Norm: a neural approach to few-shot entity normalization. BMC Bioinformatics 2020; 21:579. [PMID: 33372606 PMCID: PMC7771092 DOI: 10.1186/s12859-020-03886-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2020] [Accepted: 11/17/2020] [Indexed: 12/04/2022]
Abstract
Background Entity normalization is an important information extraction task which has gained renewed attention in the last decade, particularly in the biomedical and life science domains. In these domains, and more generally in all specialized domains, this task is still challenging for the latest machine learning-based approaches, which have difficulty handling highly multi-class and few-shot learning problems. To address this issue, we propose C-Norm, a new neural approach which synergistically combines standard and weak supervision, ontological knowledge integration and distributional semantics. Results Our approach greatly outperforms all methods evaluated on the Bacteria Biotope datasets of BioNLP Open Shared Tasks 2019, without integrating any manually-designed domain-specific rules. Conclusions Our results show that relatively shallow neural network methods can perform well in domains that present highly multi-class and few-shot learning problems.
|
32
|
Zhou Z, Hultgren KE. Complementing the US Food and Drug Administration Adverse Event Reporting System With Adverse Drug Reaction Reporting From Social Media: Comparative Analysis. JMIR Public Health Surveill 2020; 6:e19266. [PMID: 32996889 PMCID: PMC7557434 DOI: 10.2196/19266] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 04/10/2020] [Revised: 06/09/2020] [Accepted: 06/25/2020] [Indexed: 01/17/2023]
Abstract
Background Adverse drug reactions (ADRs) can occur any time someone uses a medication. ADRs are systematically tracked and cataloged, with varying degrees of success, in order to better understand their etiology and develop methods of prevention. The US Food and Drug Administration (FDA) has developed the FDA Adverse Event Reporting System (FAERS) for this purpose. FAERS collects information from myriad sources, but the primary reporters have traditionally been medical professionals and pharmacovigilance data from manufacturers. Recent studies suggest that information shared publicly on social media platforms related to medication use could be of benefit in complementing FAERS data in order to have a richer picture of how medications are actually being used and the experiences people are having across large populations. Objective The aim of this study is to validate the accuracy and precision of social media methodology and conduct evaluations of Twitter ADR reporting for commonly used pharmaceutical agents. Methods ADR data from the 10 most prescribed medications according to pharmacy claims data were collected from both FAERS and Twitter. In order to obtain data from FAERS, the SafeRx database, a curated collection of FAERS data, was used to collect data from March 1, 2016, to March 31, 2017. Twitter data were manually scraped during the same time period to extract similar data using an algorithm designed to minimize noise and false signals in social media data. Results A total of 40,539 FAERS ADR reports were obtained via SafeRx and more than 40,000 tweets containing the drug names were obtained from Twitter’s Advanced Search engine. While the FAERS data were specific to ADRs, the Twitter data were more limited. Only hydrocodone/acetaminophen, prednisone, amoxicillin, gabapentin, and metformin had a sufficient volume of ADR content for review and comparison. 
For metformin, diarrhea was the side effect that resulted in no difference between the two platforms (P=.30). For hydrocodone/acetaminophen, ineffectiveness was the ADR that resulted in no difference (P=.60). For gabapentin, there were no differences in terms of the ADRs ineffectiveness and fatigue (P=.15 and P=.67, respectively). For amoxicillin, hypersensitivity, nausea, and rash shared similar profiles between platforms (P=.35, P=.05, and P=.31, respectively). Conclusions FAERS and Twitter shared similarities in the types of data reported, along with a few items unique to each data set. The use of Twitter as an ADR pharmacovigilance platform should continue to be studied as a unique and complementary source of information rather than a validation tool of existing ADR databases.
Affiliation(s)
- Zeyun Zhou
- College of Pharmacy, Purdue University, West Lafayette, IN, United States
|
33
|
Gries KS, Fastenau J. Using a digital patient powered research network to identify outcomes of importance to patients with multiple myeloma. J Patient Rep Outcomes 2020; 4:74. [PMID: 32870420 PMCID: PMC7462947 DOI: 10.1186/s41687-020-00242-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/05/2020] [Accepted: 08/24/2020] [Indexed: 02/06/2023]
Abstract
Background Social media platforms give patients a voice by allowing them to discuss their health and connect with others. These unfiltered and genuine reports offer direct access to what matters most to patients. Exploring the patient-reported outcomes discussed in these platforms reveals clinical insights and behavioral patterns of the real-world patient journey. This research study reviewed health-related quality of life (HRQoL) concepts reported by patients with multiple myeloma (MM). Methods Data were obtained using the Belong.life patient-powered research network (PPRN) using social media listening methods. The analysis cohort consisted of adults diagnosed with MM who signed into the Belong.life platform by June 2018. Natural language processing and medical neural networks were utilized to extract text data to mine and scan for concepts using programmed algorithms. The textual review of the data was conducted on two levels: the over-arching concept of interest (broad symptom and impact classification) and the more specific symptom and impacts report. Concepts were analyzed descriptively and summarized by age, gender, context of report, and stage of disease/treatment journey. Results Two hundred thirty patients with MM from the United States (52%), Israel (42%), Canada (3%), and 3% from Egypt, France, Greece, India, United Kingdom, and Australia were identified. A total of 57% were female and at account registration the median age was 57 years. A total of 126 patients had evaluable text data to search concepts being discussed. The PPRN platform identified 93% of the concepts from the conceptual model developed based on prior literature review. The most commonly reported symptoms were neuropathy, tiredness, nausea, back pain, fatigue, and bone pain. Back pain appeared as the most prominent symptom early in the disease and sometimes occurred prior to MM diagnosis. Tiredness, nausea, fatigue, and bone pain were frequently reported after MM diagnosis, with the start of treatment. Conclusion Patient-oriented social media platforms, such as Belong.life, can capture and contribute to a holistic vision of concepts surrounding patients’ HRQoL. The ability to understand when a certain debilitating symptom appeared, and in which sub-population of patients, may allow for a personalized approach to treatment, improving adherence and quality of care as well as increasing patient well-being.
Affiliation(s)
- John Fastenau
- Janssen Global Services, 700 US-202, Raritan, NJ, 08869, USA
|
34
|
Use of Social Media for Pharmacovigilance Activities: Key Findings and Recommendations from the Vigi4Med Project. Drug Saf 2020; 43:835-851. [PMID: 32557179 DOI: 10.1007/s40264-020-00951-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/24/2022]
Abstract
The large-scale use of social media by the population has gained the attention of stakeholders and researchers in various fields. In the domain of pharmacovigilance, this new resource was initially considered as an opportunity to overcome underreporting and monitor the safety of drugs in real time in close connection with patients. Research is still required to overcome technical challenges related to data extraction, annotation, and filtering, and there is not yet a clear consensus concerning the systematic exploration and use of social media in pharmacovigilance. Although the literature has mainly considered signal detection, the potential value of social media to support other pharmacovigilance activities should also be explored. The objective of this paper is to present the main findings and subsequent recommendations from the French research project Vigi4Med, which evaluated the use of social media, mainly web forums, for pharmacovigilance activities. This project included an analysis of the existing literature, which contributed to the recommendations presented herein. The recommendations are grouped into three categories: ethical (related to privacy, confidentiality, and follow-up), qualitative (related to the quality of the information), and quantitative (related to statistical analysis). We argue that the progress in information technology and the societal need to consider patients' experiences should motivate future research on social media surveillance for the reinforcement of classical pharmacovigilance.
|
35
|
O'Connor K, Sarker A, Perrone J, Gonzalez Hernandez G. Promoting Reproducible Research for Characterizing Nonmedical Use of Medications Through Data Annotation: Description of a Twitter Corpus and Guidelines. J Med Internet Res 2020; 22:e15861. [PMID: 32130117 PMCID: PMC7066507 DOI: 10.2196/15861] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 08/13/2019] [Revised: 11/14/2019] [Accepted: 12/15/2019] [Indexed: 11/13/2022]
Abstract
BACKGROUND Social media data are being increasingly used for population-level health research because they provide near real-time access to large volumes of consumer-generated data. Recently, a number of studies have explored the possibility of using social media data, such as from Twitter, for monitoring prescription medication abuse. However, there is a paucity of annotated data or guidelines for data characterization that discuss how information related to abuse-prone medications is presented on Twitter. OBJECTIVE This study discusses the creation of an annotated corpus suitable for training supervised classification algorithms for the automatic classification of medication abuse-related chatter. The annotation strategies used for improving interannotator agreement (IAA), a detailed annotation guideline, and machine learning experiments that illustrate the utility of the annotated corpus are also described. METHODS We employed an iterative annotation strategy, with interannotator discussions held and updates made to the annotation guidelines at each iteration to improve IAA for the manual annotation task. Using the grounded theory approach, we first characterized tweets into fine-grained categories and then grouped them into 4 broad classes: abuse or misuse, personal consumption, mention, and unrelated. After the completion of manual annotations, we experimented with several machine learning algorithms to illustrate the utility of the corpus and generate baseline performance metrics for automatic classification on these data. RESULTS Our final annotated set consisted of 16,443 tweets mentioning at least 20 abuse-prone medications including opioids, benzodiazepines, atypical antipsychotics, central nervous system stimulants, and gamma-aminobutyric acid analogs. Our final overall IAA was 0.86 (Cohen kappa), which represents high agreement.
The manual annotation process revealed the variety of ways in which prescription medication misuse or abuse is discussed on Twitter, including expressions indicating coingestion, nonmedical use, nonstandard route of intake, and consumption above the prescribed doses. Among machine learning classifiers, support vector machines obtained the highest automatic classification accuracy of 73.00% (95% CI 71.4-74.5) over the test set (n=3271). CONCLUSIONS Our manual analysis and annotations of a large number of tweets have revealed types of information posted on Twitter about a set of abuse-prone prescription medications and their distributions. In the interests of reproducible and community-driven research, we have made our detailed annotation guidelines and the training data for the classification experiments publicly available, and the test data will be used in future shared tasks.
Affiliation(s)
- Karen O'Connor
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Abeed Sarker
  - Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, GA, United States
- Jeanmarie Perrone
  - Department of Emergency Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Graciela Gonzalez Hernandez
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
|
36
|
Weissenbacher D, Sarker A, Klein A, O’Connor K, Magge A, Gonzalez-Hernandez G. Deep neural networks ensemble for detecting medication mentions in tweets. J Am Med Inform Assoc 2019; 26:1618-1626. [PMID: 31562510 PMCID: PMC6857507 DOI: 10.1093/jamia/ocz156] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 03/28/2019] [Revised: 07/26/2019] [Accepted: 08/13/2019] [Indexed: 11/12/2022] Open
Abstract
OBJECTIVE Twitter posts are now recognized as an important source of patient-generated data, providing unique insights into population health. A fundamental step toward incorporating Twitter data in pharmacoepidemiologic research is to automatically recognize medication mentions in tweets. Given that lexical searches for medication names suffer from low recall due to misspellings or ambiguity with common words, we propose a more advanced method to recognize them. MATERIALS AND METHODS We present Kusuri, an Ensemble Learning classifier able to identify tweets mentioning drug products and dietary supplements. Kusuri (薬, "medication" in Japanese) is composed of 2 modules: first, 4 different classifiers (lexicon based, spelling variant based, pattern based, and a weakly trained neural network) are applied in parallel to discover tweets potentially containing medication names; second, an ensemble of deep neural networks encoding morphological, semantic, and long-range dependencies of important words in the tweets makes the final decision. RESULTS On a class-balanced (50-50) corpus of 15 005 tweets, Kusuri demonstrated performances close to human annotators with an F1 score of 93.7%, the best score achieved thus far on this corpus. On a corpus made of all tweets posted by 112 Twitter users (98 959 tweets, with only 0.26% mentioning medications), Kusuri obtained an F1 score of 78.8%. To the best of our knowledge, Kusuri is the first system to achieve this score on such an extremely imbalanced dataset. CONCLUSIONS The system identifies tweets mentioning drug names with performance high enough to ensure its usefulness, and is ready to be integrated in pharmacovigilance, toxicovigilance, or more generally, public health pipelines that depend on medication name mentions.
Affiliation(s)
- Davy Weissenbacher
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Abeed Sarker
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Ari Klein
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Karen O’Connor
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Arjun Magge
  - Biodesign Center for Environmental Health Engineering, Biodesign Institute, Arizona State University, Tempe, Arizona, USA
- Graciela Gonzalez-Hernandez
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
|
37
|
Sarker A, Gonzalez-Hernandez G, Ruan Y, Perrone J. Machine Learning and Natural Language Processing for Geolocation-Centric Monitoring and Characterization of Opioid-Related Social Media Chatter. JAMA Netw Open 2019; 2:e1914672. [PMID: 31693125 PMCID: PMC6865282 DOI: 10.1001/jamanetworkopen.2019.14672] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Indexed: 12/31/2022] Open
Abstract
IMPORTANCE Automatic curation of consumer-generated, opioid-related social media big data may enable real-time monitoring of the opioid epidemic in the United States. OBJECTIVE To develop and validate an automatic text-processing pipeline for geospatial and temporal analysis of opioid-mentioning social media chatter. DESIGN, SETTING, AND PARTICIPANTS This cross-sectional, population-based study was conducted from December 1, 2017, to August 31, 2019, and used more than 3 years of publicly available social media posts on Twitter, dated from January 1, 2012, to October 31, 2015, that were geolocated in Pennsylvania. Opioid-mentioning tweets were extracted using prescription and illicit opioid names, including street names and misspellings. Social media posts (tweets) (n = 9006) were manually categorized into 4 classes, and training and evaluation of several machine learning algorithms were performed. Temporal and geospatial patterns were analyzed with the best-performing classifier on unlabeled data. MAIN OUTCOMES AND MEASURES Pearson and Spearman correlations of county- and substate-level abuse-indicating tweet rates with opioid overdose death rates from the Centers for Disease Control and Prevention WONDER database and with 4 metrics from the National Survey on Drug Use and Health for 3 years were calculated. Classifier performances were measured through microaveraged F1 scores (harmonic mean of precision and recall) or accuracies and 95% CIs. RESULTS A total of 9006 social media posts were annotated, of which 1748 (19.4%) were related to abuse, 2001 (22.2%) were related to information, 4830 (53.6%) were unrelated, and 427 (4.7%) were not in the English language. Yearly rates of abuse-indicating social media posts showed statistically significant correlation with county-level opioid-related overdose death rates (n = 75) for 3 years (Pearson r = 0.451, P < .001; Spearman r = 0.331, P = .004). 
Abuse-indicating tweet rates showed consistent correlations with 4 NSDUH metrics (n = 13) associated with nonmedical prescription opioid use (Pearson r = 0.683, P = .01; Spearman r = 0.346, P = .25), illicit drug use (Pearson r = 0.850, P < .001; Spearman r = 0.341, P = .25), illicit drug dependence (Pearson r = 0.937, P < .001; Spearman r = 0.495, P = .09), and illicit drug dependence or abuse (Pearson r = 0.935, P < .001; Spearman r = 0.401, P = .17) over the same 3-year period, although the tests lacked power to demonstrate statistical significance. A classification approach involving an ensemble of classifiers produced the best performance in accuracy or microaveraged F1 score (0.726; 95% CI, 0.708-0.743). CONCLUSIONS AND RELEVANCE The correlations obtained in this study suggest that a social media-based approach reliant on supervised machine learning may be suitable for geolocation-centric monitoring of the US opioid epidemic in near real time.
Affiliation(s)
- Abeed Sarker
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia
  - Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, Georgia
- Graciela Gonzalez-Hernandez
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia
- Yucheng Ruan
  - School of Engineering and Applied Science, University of Pennsylvania, Philadelphia
- Jeanmarie Perrone
  - Department of Emergency Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia
|
38
|
Klein AZ, Sarker A, Weissenbacher D, Gonzalez-Hernandez G. Towards scaling Twitter for digital epidemiology of birth defects. NPJ Digit Med 2019; 2:96. [PMID: 31583284 PMCID: PMC6773753 DOI: 10.1038/s41746-019-0170-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 05/22/2019] [Accepted: 08/12/2019] [Indexed: 11/13/2022] Open
Abstract
Social media has recently been used to identify and study a small cohort of Twitter users whose pregnancies with birth defect outcomes (the leading cause of infant mortality) could be observed via their publicly available tweets. In this study, we exploit social media on a larger scale by developing natural language processing (NLP) methods to automatically detect, among thousands of users, a cohort of mothers reporting that their child has a birth defect. We used 22,999 annotated tweets to train and evaluate supervised machine learning algorithms (feature-engineered and deep learning-based classifiers) that automatically distinguish tweets referring to the user's pregnancy outcome from tweets that merely mention birth defects. Because 90% of the tweets merely mention birth defects, we experimented with under-sampling and over-sampling approaches to address this class imbalance. An SVM classifier achieved the best performance for the two positive classes: an F1-score of 0.65 for the "defect" class and 0.51 for the "possible defect" class. We deployed the classifier on 20,457 unlabeled tweets that mention birth defects, which helped identify 542 additional users for potential inclusion in our cohort. Contributions of this study include (1) NLP methods for automatically detecting tweets by users reporting their birth defect outcomes, (2) findings that an SVM classifier can outperform a deep neural network-based classifier for highly imbalanced social media data, (3) evidence that automatic classification can be used to identify additional users for potential inclusion in our cohort, and (4) a publicly available corpus for training and evaluating supervised machine learning algorithms.
Affiliation(s)
- Ari Z. Klein
  - Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Abeed Sarker
  - Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA, USA
- Davy Weissenbacher
  - Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Graciela Gonzalez-Hernandez
  - Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
|
39
|
Data-Driven Lexical Normalization for Medical Social Media. Multimodal Technologies and Interaction 2019. [DOI: 10.3390/mti3030060] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/16/2022] Open
Abstract
In the medical domain, user-generated social media text is increasingly used as a valuable complementary knowledge source to scientific medical literature. The extraction of this knowledge is complicated by colloquial language use and misspellings. However, lexical normalization of such data has not been addressed effectively. This paper presents a data-driven lexical normalization pipeline with a novel spelling correction module for medical social media. Our method significantly outperforms state-of-the-art spelling correction methods and can detect mistakes with an F1 of 0.63 despite extreme imbalance in the data. We also present the first corpus for spelling mistake detection and correction in a medical patient forum.
|
40
|
Conway M, Hu M, Chapman WW. Recent Advances in Using Natural Language Processing to Address Public Health Research Questions Using Social Media and Consumer-Generated Data. Yearb Med Inform 2019; 28:208-217. [PMID: 31419834 PMCID: PMC6697505 DOI: 10.1055/s-0039-1677918] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Indexed: 12/11/2022] Open
Abstract
OBJECTIVE We present a narrative review of recent work on the utilisation of Natural Language Processing (NLP) for the analysis of social media (including online health communities) specifically for public health applications. METHODS We conducted a literature review of NLP research that utilised social media or online consumer-generated text for public health applications, focussing on the years 2016 to 2018. Papers were identified in several ways, including PubMed searches and the inspection of recent conference proceedings from the Association for Computational Linguistics (ACL), the Conference on Human Factors in Computing Systems (CHI), and the International AAAI (Association for the Advancement of Artificial Intelligence) Conference on Web and Social Media (ICWSM). Popular data sources included Twitter, Reddit, various online health communities, and Facebook. RESULTS In the recent past, communicable diseases (e.g., influenza, dengue) have been the focus of much social media-based NLP health research. However, mental health and substance use and abuse (including the use of tobacco, alcohol, marijuana, and opioids) have been the subject of an increasing volume of research in the 2016-2018 period. Associated with this trend, the use of lexicon-based methods remains popular given the availability of psychologically validated lexical resources suitable for mental health and substance abuse research. Finally, we found that in the period under review "modern" machine learning methods (i.e., deep neural-network-based methods), while increasing in popularity, remain less widely used than "classical" machine learning methods.
Affiliation(s)
- Mike Conway
  - Department of Biomedical Informatics, University of Utah, Salt Lake City, Utah, United States
- Mengke Hu
  - Department of Biomedical Informatics, University of Utah, Salt Lake City, Utah, United States
- Wendy W Chapman
  - Department of Biomedical Informatics, University of Utah, Salt Lake City, Utah, United States
|
41
|
Natural language processing of Reddit data to evaluate dermatology patient experiences and therapeutics. J Am Acad Dermatol 2019; 83:803-808. [PMID: 31306722 DOI: 10.1016/j.jaad.2019.07.014] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 05/09/2019] [Revised: 06/30/2019] [Accepted: 07/03/2019] [Indexed: 11/24/2022]
Abstract
BACKGROUND There is a lack of research studying patient-generated data on Reddit, one of the world's most popular forums with active users interested in dermatology. Techniques within natural language processing, a field of artificial intelligence, can analyze large amounts of text information and extract insights. OBJECTIVE To apply natural language processing to Reddit comments about dermatology topics to assess for feasibility and potential for insights and engagement. METHODS A software pipeline preprocessed Reddit comments from 2005 to 2017 from 7 popular dermatology-related subforums on Reddit, applied latent Dirichlet allocation, and used spectral clustering to establish cohesive themes and the frequency of word representation and grouped terms within these topics. RESULTS We created a corpus of 176,000 comments and identified trends in patient engagement in spaces such as eczema and acne, among others, with a focus on homeopathic treatments and isotretinoin. LIMITATIONS Latent Dirichlet allocation is an unsupervised model, meaning there is no ground truth to which the model output can be compared. However, because these forums are anonymous, there seems little incentive for patients to be dishonest. CONCLUSIONS Reddit data has viability and utility for dermatologic research and engagement with the public, especially for common dermatology topics such as tanning, acne, and psoriasis.
|
42
|
Klein AZ, Sarker A, O'Connor K, Gonzalez-Hernandez G. An Analysis of a Twitter Corpus for Training a Medication Intake Classifier. AMIA Jt Summits Transl Sci Proc 2019; 2019:102-106. [PMID: 31258961 PMCID: PMC6568126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
While social media has evolved into a useful resource for studying medication-related information, observational studies of medications have continued to rely on other sources of data. Towards advancing the use of social media data for medication-related observational studies, we analyze an annotated corpus of 27,941 tweets designed for training machine learning algorithms to automatically detect users' medication intake. In particular, we assess how a baseline classifier trained on the general corpus (that is, on various types of medication) performs for specific types. For most types, the classifier performs significantly better than it does overall; however, for nervous system medications, it performs significantly worse. These results suggest that, while the general corpus may have utility for observational studies focusing on most types of medication, studying nervous system medications may benefit from training a classifier exclusively for this type. We will explore this data-level approach in future work.
Affiliation(s)
- Ari Z Klein
  - Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Abeed Sarker
  - Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Karen O'Connor
  - Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Graciela Gonzalez-Hernandez
  - Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
|