Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Sampathkumar H, Chen XW, Luo B. Mining adverse drug reactions from online healthcare forums using hidden Markov model. BMC Med Inform Decis Mak 2014;14:91. [PMID: 25341686 PMCID: PMC4283122 DOI: 10.1186/1472-6947-14-91] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.1] [Received: 12/24/2013] [Accepted: 08/18/2014] [Indexed: 11/18/2022] Open

Cited by Other Article(s)

Martenot V, Masdeu V, Cupe J, Gehin F, Blanchon M, Dauriat J, Horst A, Renaudin M, Girard P, Zucker JD. LiSA: an assisted literature search pipeline for detecting serious adverse drug events with deep learning. BMC Med Inform Decis Mak 2022;22:338. [PMID: 36550485 PMCID: PMC9773506 DOI: 10.1186/s12911-022-02085-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2022] [Accepted: 12/13/2022] [Indexed: 12/24/2022] Open

Abstract

INTRODUCTION

Detecting safety signals attributed to a drug in scientific literature is a fundamental issue in pharmacovigilance. The constant increase in the volume of publications requires the automation of this tedious task, in order to find and extract relevant articles from the pack. This task is critical, as serious Adverse Drug Reactions (ADRs) still account for a large number of hospital admissions each year.

OBJECTIVES

The aim of this study is to develop an augmented intelligence methodology for automatically identifying relevant publications mentioning an established link between a Drug and a Serious Adverse Event, according to the European Medicines Agency (EMA) definition of seriousness.

METHODS

The proposed pipeline, called LiSA (for Literature Search Application), is based on three independent deep learning models supporting precise detection of safety signals in the biomedical literature. By combining Bidirectional Encoder Representations from Transformers (BERT) algorithms and a modular architecture, the pipeline achieves a precision of 0.81 and a recall of 0.89 at the sentence level in articles extracted from PubMed (either abstract or full text). We also measured that, by using LiSA, a medical reviewer increases by a factor of 2.5 the number of relevant documents they can collect and evaluate compared to a simple keyword search. In the interest of reusability, emphasis was placed on building a modular pipeline that allows other NLP modules to be inserted to enrich the results provided by the system and to extend it to other use cases. In addition, a lightweight visualization tool was developed to analyze and monitor safety signal results.

CONCLUSIONS

Overall, the generic pipeline and the visualization tool proposed in this article allow for efficient and accurate monitoring of serious adverse drug reactions from the literature and can easily be adapted to similar pharmacovigilance use cases. To facilitate reproducibility and benefit other research studies, we also shared a first benchmark dataset for Serious Adverse Drug Events detection.

Quazi S. Artificial intelligence and machine learning in precision and genomic medicine. Med Oncol 2022;39:120. [PMID: 35704152 PMCID: PMC9198206 DOI: 10.1007/s12032-022-01711-1] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Received: 03/06/2022] [Accepted: 03/14/2022] [Indexed: 10/28/2022]

Quazi S. Artificial intelligence and machine learning in precision and genomic medicine. Med Oncol 2022;39:120. [PMID: 35704152 PMCID: PMC9198206 DOI: 10.1007/s12032-022-01711-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 03/28/2023]

Huang JY, Lee WP, Lee KD. Predicting Adverse Drug Reactions from Social Media Posts: Data Balance, Feature Selection and Deep Learning. Healthcare (Basel) 2022;10:618. [PMID: 35455795 PMCID: PMC9024774 DOI: 10.3390/healthcare10040618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2022] [Revised: 03/22/2022] [Accepted: 03/23/2022] [Indexed: 11/16/2022] Open

Using Machine Learning for Pharmacovigilance: A Systematic Review. Pharmaceutics 2022;14:266. [PMID: 35213998 PMCID: PMC8924891 DOI: 10.3390/pharmaceutics14020266] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/17/2021] [Revised: 01/13/2022] [Accepted: 01/21/2022] [Indexed: 02/04/2023] Open

Alarifi M, Jabour A, Foy DM, Zolnoori M. Identifying the underlying factors associated with antidepressant drug discontinuation: content analysis of patients' drug reviews. Inform Health Soc Care 2022;47:414-423. [PMID: 35050827 DOI: 10.1080/17538157.2021.2024835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]

Gattepaille LM, Hedfors Vidlin S, Bergvall T, Pierce CE, Ellenius J. Prospective Evaluation of Adverse Event Recognition Systems in Twitter: Results from the Web-RADR Project. Drug Saf 2020;43:797-808. [PMID: 32410156 PMCID: PMC7395913 DOI: 10.1007/s40264-020-00942-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 02/06/2023]

Abstract

Introduction

A large number of studies on systems to detect and sometimes normalize adverse events (AEs) in social media have been published, but evidence of their practical utility is scarce. This raises the question of the transferability of such systems to new settings.

Objectives

The aims of this study were to develop an AE recognition system, prospectively evaluate its performance on an external benchmark dataset and identify potential factors influencing the transferability of AE recognition systems.

Methods

A pipeline based on dictionary lookups and logistic regression classifiers was developed using a proprietary dataset of 196,533 Tweets manually annotated for AE relations. The system was then prospectively evaluated on the publicly available WEB-RADR reference dataset, exploring different aspects affecting transferability.

Results

Our system achieved 0.53 precision, 0.52 recall and 0.52 F1-score on the development test set; however, when applied to the WEB-RADR reference dataset, system performance dropped to 0.38 precision, 0.20 recall and 0.26 F1-score. Similarly, a previously published method aiming at automatically detecting adverse event posts reported 0.5 precision, 0.92 recall and 0.65 F1-score on another dataset, while its performance on the WEB-RADR reference dataset was reduced to 0.37 precision, 0.63 recall and 0.46 F1-score. We identified four potential factors leading to poor transferability: overfitting, selection bias, label bias and prevalence.
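The reported F1-scores can be checked against the reported precision and recall, since F1 is the harmonic mean of the two. The snippet below is a minimal sketch of that check. The function name and dictionary labels are illustrative, and small differences are expected because the published precision and recall values are already rounded.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) triples quoted in the Results paragraph.
# The dictionary keys are illustrative labels, not names from the paper.
reported = {
    "system, development test set": (0.53, 0.52, 0.52),
    "system, WEB-RADR reference": (0.38, 0.20, 0.26),
    "earlier method, original dataset": (0.50, 0.92, 0.65),
    "earlier method, WEB-RADR reference": (0.37, 0.63, 0.46),
}

for name, (p, r, f1) in reported.items():
    print(f"{name}: recomputed F1 = {f1_score(p, r):.3f}, reported = {f1:.2f}")
```

Each recomputed value lands within 0.01 of the figure in the abstract, which is consistent with rounding of the inputs.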

Conclusion

We warn the community about a potentially large discrepancy between the expected performance of automated AE recognition systems based on published results and the actual observed performance on independent data. This study highlights the difficulty of implementing an all-purpose system for automatic adverse event recognition in Twitter, which could explain the lack of such systems in practical pharmacovigilance settings. Our recommendation is to use benchmark independent datasets, such as the WEB-RADR reference, to investigate the transferability of the adverse event recognition systems and ultimately enforce rigorous comparisons across studies on the task.

Electronic supplementary material

The online version of this article (10.1007/s40264-020-00942-3) contains supplementary material, which is available to authorized users.

Cheerkoot-Jalim S, Khedo KK. A systematic review of text mining approaches applied to various application areas in the biomedical domain. JOURNAL OF KNOWLEDGE MANAGEMENT 2020. [DOI: 10.1108/jkm-09-2019-0524] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/11/2022]

Abstract

Purpose

This work shows the results of a systematic literature review on biomedical text mining. The purpose of this study is to identify the different text mining approaches used in different application areas of the biomedical domain, the common tools used and the challenges of biomedical text mining as compared to generic text mining algorithms. This study will be of value to biomedical researchers by allowing them to correlate text mining approaches to specific biomedical application areas. Implications for future research are also discussed.

Design/methodology/approach

The review was conducted following the principles of the Kitchenham method. A number of research questions were first formulated, followed by the definition of the search strategy. The papers were then selected based on a list of assessment criteria. Each of the papers was analyzed and information relevant to the research questions was extracted.

Findings

It was found that researchers have mostly harnessed data sources such as electronic health records, biomedical literature, social media and health-related forums. The most common text mining technique was natural language processing using tools such as MetaMap and Unstructured Information Management Architecture, alongside the use of medical terminologies such as Unified Medical Language System. The main application area was the detection of adverse drug events. Challenges identified included the need to deal with huge amounts of text, the heterogeneity of the different data sources, the duality of meaning of words in biomedical text and the amount of noise introduced mainly from social media and health-related forums.

Originality/value

To the best of the authors’ knowledge, other reviews in this area have focused on either specific techniques, specific application areas or specific data sources. The results of this review will help researchers to correlate most relevant and recent advances in text mining approaches to specific biomedical application areas by providing an up-to-date and holistic view of work done in this research area. The use of emerging text mining techniques has great potential to spur the development of innovative applications, thus considerably impacting on the advancement of biomedical research.

Gujral H, Kushwaha AK, Khurana S. Utilization of Time Series Tools in Life-sciences and Neuroscience. Neurosci Insights 2020;15:2633105520963045. [PMID: 33345189 PMCID: PMC7727047 DOI: 10.1177/2633105520963045] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/10/2020] [Accepted: 09/11/2020] [Indexed: 01/18/2023] Open

Learning structured medical information from social media. J Biomed Inform 2020;110:103568. [PMID: 32942027 DOI: 10.1016/j.jbi.2020.103568] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/20/2019] [Revised: 08/21/2020] [Accepted: 09/12/2020] [Indexed: 11/21/2022]

Nguyen VH, Sugiyama K, Kan MY, Halder K. Neural side effect discovery from user credibility and experience-assessed online health discussions. J Biomed Semantics 2020;11:5. [PMID: 32641159 PMCID: PMC7341623 DOI: 10.1186/s13326-020-00221-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/02/2019] [Accepted: 06/07/2020] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Health 2.0 allows patients and caregivers to conveniently seek medical information and advice via e-portals and online discussion forums, especially regarding potential drug side effects. Although online health communities are helpful platforms for obtaining non-professional opinions, they pose risks in communicating unreliable and insufficient information in terms of quality and quantity. Existing methods for extracting user-reported adverse drug reactions (ADRs) in online health forums are not only insufficiently accurate, as they disregard user credibility and drug experience, but are also expensive, as they rely on supervised ground truth annotation of individual statements. We propose a NEural ArchiTecture for Drug side effect prediction (NEAT), which is optimized on the task of drug side effect discovery based on a complete discussion while being attentive to user credibility and experience, thus addressing the mentioned shortcomings. We train our neural model in a self-supervised fashion using ground truth drug side effects from mayoclinic.org. NEAT learns to assign each user a score that is descriptive of their credibility and highlights the critical textual segments of their post.

RESULTS

Experiments show that NEAT improves drug side effect discovery from online health discussion by 3.04% over user-credibility agnostic baselines, and by 9.94% over non-neural baselines in terms of F1. Additionally, the latent credibility scores learned by the model correlate well with trustworthiness signals, such as the number of "thanks" received from other forum members, and improve on credibility heuristics such as number of posts by 0.113 in terms of Spearman's rank correlation coefficient. Experience-based self-supervised attention highlights critical phrases such as mentioned side effects, and enhances fully supervised ADR extraction models based on sequence labelling by 5.502% in terms of precision.
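Spearman's rank correlation, the measure used above to compare credibility scores with trustworthiness signals, is the Pearson correlation of the two variables' ranks. The following is a minimal pure-Python sketch. The credibility scores and "thanks" counts are made-up illustration values, not data from the study.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-user values, for illustration only.
credibility = [0.9, 0.4, 0.7, 0.1]
thanks_received = [12, 5, 3, 0]
rho = spearman(credibility, thanks_received)
print(f"Spearman's rho = {rho:.2f}")
```

In this toy example the two rankings differ only by one swapped pair, so the coefficient is high but below 1.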

CONCLUSIONS

NEAT considers both user credibility and experience in online health forums, making feasible a self-supervised approach to side effect prediction for mentioned drugs. The derived user credibility and attention mechanism are transferable and improve downstream ADR extraction models. Our approach enhances automatic drug side effect discovery and fosters research in several domains including pharmacovigilance and clinical studies.

Spiro A, Fernández García J, Yanover C. Inferring new relations between medical entities using literature curated term co-occurrences. JAMIA Open 2020;2:378-385. [PMID: 31984370 PMCID: PMC6951958 DOI: 10.1093/jamiaopen/ooz022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2019] [Revised: 06/05/2019] [Accepted: 06/08/2019] [Indexed: 11/17/2022] Open

Abstract

Objectives

Identifying new relations between medical entities, such as drugs, diseases, and side effects, is typically a resource-intensive task, involving experimentation and clinical trials. The increased availability of related data and curated knowledge enables a computational approach to this task, notably by training models to predict likely relations. Such models rely on meaningful representations of the medical entities being studied. We propose a generic feature vector representation that leverages co-occurrences of medical terms, linked with PubMed citations.

Materials and Methods

We demonstrate the usefulness of the proposed representation by inferring two types of relations: a drug causes a side effect and a drug treats an indication. To predict these relations and assess their effectiveness, we applied 2 modeling approaches: multi-task modeling using neural networks and single-task modeling based on gradient boosting machines and logistic regression.

Results

These trained models, which predict either side effects or indications, obtained significantly better results than baseline models that use a single direct co-occurrence feature. The results demonstrate the advantage of a comprehensive representation.

Discussion

Selecting the appropriate representation has an immense impact on the predictive performance of machine learning models. Our proposed representation is powerful, as it spans multiple medical domains and can be used to predict a wide range of relation types.

Conclusion

The discovery of new relations between various medical entities can be translated into meaningful insights, for example, related to drug development or disease understanding. Our representation of medical entities can be used to train models that predict such relations, thus accelerating healthcare-related discoveries.

Ahmed Z, Mohamed K, Zeeshan S, Dong X. Artificial intelligence with multi-functional machine learning platform development for better healthcare and precision medicine. Database (Oxford) 2020;2020:baaa010. [PMID: 32185396 PMCID: PMC7078068 DOI: 10.1093/database/baaa010] [Citation(s) in RCA: 151] [Impact Index Per Article: 37.8] [Received: 11/02/2019] [Revised: 01/05/2020] [Accepted: 01/21/2020] [Indexed: 02/06/2023]

Abstract

Precision medicine is one of the recent and powerful developments in medical care, which has the potential to improve the traditional symptom-driven practice of medicine, allowing earlier interventions using advanced diagnostics and tailoring better and more economical personalized treatments. Identifying the best pathway to personalized and population medicine involves the ability to analyze comprehensive patient information together with broader aspects to monitor and distinguish between sick and relatively healthy people, which will lead to a better understanding of biological indicators that can signal shifts in health. While the complexities of disease at the individual level have made it difficult to utilize healthcare information in clinical decision-making, some of the existing constraints have been greatly minimized by technological advancements. To implement effective precision medicine with enhanced ability to positively impact patient outcomes and provide real-time decision support, it is important to harness the power of electronic health records by integrating disparate data sources and discovering patient-specific patterns of disease progression. Useful analytic tools, technologies, databases, and approaches are required to augment networking and interoperability of clinical, laboratory and public health systems, as well as to address ethical and social issues related to the privacy and protection of healthcare data with effective balance. Developing multifunctional machine learning platforms for clinical data extraction, aggregation, management and analysis can support clinicians by efficiently stratifying subjects to understand specific scenarios and optimize decision-making. Implementation of artificial intelligence in healthcare is a compelling vision that has the potential to lead to significant improvements toward the goals of providing real-time, better personalized and population medicine at lower costs. In this study, we focused on analyzing and discussing various published artificial intelligence and machine learning solutions, approaches and perspectives, aiming to advance academic solutions in paving the way for a new data-centric era of discovery in healthcare.

Wunnava S, Qin X, Kakar T, Sen C, Rundensteiner EA, Kong X. Adverse Drug Event Detection from Electronic Health Records Using Hierarchical Recurrent Neural Networks with Dual-Level Embedding. Drug Saf 2019;42:113-122. [PMID: 30649736 DOI: 10.1007/s40264-018-0765-9] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Indexed: 02/02/2023]

Abstract

INTRODUCTION

Adverse drug event (ADE) detection is a vital step towards effective pharmacovigilance and prevention of future incidents caused by potentially harmful ADEs. The electronic health records (EHRs) of patients in hospitals contain valuable information regarding ADEs and hence are an important source for detecting ADE signals. However, EHR texts tend to be noisy. Yet applying off-the-shelf tools for EHR text preprocessing jeopardizes the subsequent ADE detection performance, which depends on a well tokenized text input.

OBJECTIVE

In this paper, we report our experience with the NLP Challenges for Detecting Medication and Adverse Drug Events from Electronic Health Records (MADE1.0), which aims to promote deep innovations on this subject. In particular, we have developed rule-based sentence and word tokenization techniques to deal with the noise in the EHR text.

METHODS

We propose a detection methodology by adapting a three-layered, deep learning architecture of (1) recurrent neural network [bi-directional long short-term memory (Bi-LSTM)] for character-level word representation to encode the morphological features of the medical terminology, (2) Bi-LSTM for capturing the contextual information of each word within a sentence, and (3) conditional random fields for the final label prediction by also considering the surrounding words. We experiment with different word embedding methods commonly used in word-level classification tasks and demonstrate the impact of an integrated usage of both domain-specific and general-purpose pre-trained word embedding for detecting ADEs from EHRs.
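Step (3) above assigns each token a label while taking neighbouring labels into account. In a linear-chain conditional random field this is typically done with Viterbi decoding over per-token scores and label-to-label transition scores. The sketch below illustrates only that decoding step and is not the authors' implementation. The tag set, the transition scores and the emission scores (which would come from the Bi-LSTM layers) are hypothetical values chosen for illustration.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring label sequence for one sentence.

    emissions[t][j]   : score of label j at token t (e.g. Bi-LSTM output).
    transitions[i][j] : score of moving from label i to label j.
    Assumes at least one token.
    """
    n_labels = len(transitions)
    # best[j] = best score of any path that ends in label j at the current token
    best = list(emissions[0])
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for j in range(n_labels):
            # choose the previous label that maximises the score of reaching j
            prev = max(range(n_labels), key=lambda i: best[i] + transitions[i][j])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
        best = new_best
        backpointers.append(pointers)
    # start from the best final label and follow the back-pointers
    label = max(range(n_labels), key=lambda j: best[j])
    path = [label]
    for pointers in reversed(backpointers):
        label = pointers[label]
        path.append(label)
    return path[::-1]


LABELS = ["O", "B-ADE", "I-ADE"]  # hypothetical tag set
transitions = [
    [0.0, 0.0, -10.0],  # O -> I-ADE is strongly discouraged
    [0.0, -1.0, 1.0],   # B-ADE -> I-ADE is encouraged
    [0.0, -1.0, 0.5],
]
emissions = [
    [2.0, 0.1, 0.0],  # token 1, e.g. "developed"
    [0.2, 1.5, 0.3],  # token 2, e.g. "severe"
    [0.4, 0.2, 1.0],  # token 3, e.g. "rash"
]
tags = [LABELS[i] for i in viterbi_decode(emissions, transitions)]
print(tags)
```

The transition scores are what let the surrounding words influence each prediction: a label that looks plausible for a token in isolation can still be rejected if it forms an unlikely sequence with its neighbours.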

RESULTS

Our system was ranked first for the named entity recognition task in the MADE1.0 challenge, with a micro-averaged F1-score of 0.8290 (official score).

CONCLUSION

Our results indicate that the integration of two widely used sequence labeling techniques that complement each other along with dual-level embedding (character level and word level) to represent words in the input layer results in a deep learning architecture that achieves excellent information extraction accuracy for EHR notes.

Arnoux-Guenegou A, Girardeau Y, Chen X, Deldossi M, Aboukhamis R, Faviez C, Dahamna B, Karapetiantz P, Guillemin-Lanne S, Lillo-Le Louët A, Texier N, Burgun A, Katsahian S. The Adverse Drug Reactions From Patient Reports in Social Media Project: Protocol for an Evaluation Against a Gold Standard. JMIR Res Protoc 2019;8:e11448. [PMID: 31066711 PMCID: PMC6528435 DOI: 10.2196/11448] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 06/29/2018] [Revised: 11/16/2018] [Accepted: 12/21/2018] [Indexed: 12/30/2022] Open

Abstract

Background

Social media is a potential source of information for postmarketing drug safety surveillance that remains unexploited. Information technology solutions that aim to extract adverse drug reactions (ADRs) from posts on health forums require a rigorous evaluation methodology if their results are to be used to make decisions. First, a gold standard, consisting of manual annotations of the ADRs by human experts in the corpus extracted from social media, must be implemented and its quality must be assessed. Second, as for clinical research protocols, the sample size must rely on statistical arguments. Finally, the extraction methods must target the relation between the drug and the disease (which might be either treated or caused by the drug) rather than simple co-occurrences in the posts.

Objective

We propose a standardized protocol for evaluating software that extracts ADRs from messages on health forums. The study is conducted as part of the Adverse Drug Reactions from Patient Reports in Social Media project.

Methods

Messages from French health forums were extracted. Entity recognition was based on the Racine Pharma lexicon for drugs and the Medical Dictionary for Regulatory Activities terminology for potential adverse events (AEs). Natural language processing–based techniques automated the ADR information extraction (relation between the drug and AE entities). The evaluation corpus was a random sample of the messages containing drugs and/or AE concepts corresponding to recent pharmacovigilance alerts. Two persons experienced in medical terminology manually annotated the corpus according to an annotator guideline, thus creating the gold standard. We will evaluate our tool against the gold standard with recall, precision, and f-measure. Interannotator agreement, reflecting gold standard quality, will be evaluated with hierarchical kappa. Granularities in the terminologies will be further explored.
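Interannotator agreement of the kind described above corrects the raw agreement rate for the agreement expected by chance. The protocol names a hierarchical kappa, whose exact definition is not given here. The sketch below uses the simpler, unweighted Cohen's kappa for two annotators as a stand-in, and the annotation labels are invented for illustration.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two annotators labelling the same items."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(a)
    # proportion of items on which the annotators agree
    observed = sum(x == y for x, y in zip(a, b)) / n
    # agreement expected if both labelled at random with their own label rates
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical annotations of ten candidate drug/AE pairs.
annotator_1 = ["ADR", "ADR", "none", "none", "ADR", "none", "none", "ADR", "none", "none"]
annotator_2 = ["ADR", "none", "none", "none", "ADR", "none", "ADR", "ADR", "none", "none"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

Here the annotators agree on 8 of 10 items, but because chance alone would produce 52% agreement with these label rates, kappa is noticeably lower than the raw 0.8.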

Results

The necessary and sufficient sample size was calculated to ensure statistical confidence in the assessed results. As we expected a global recall of 0.5, we needed at least 384 identified ADR concepts to obtain a 95% CI with a total width of 0.10 around 0.5. The automated ADR information extraction in the evaluation corpus is finished, and the 2 annotators have completed the annotation process. The analysis of the performance of the ADR information extraction module against the gold standard is ongoing.
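The figure of 384 is consistent with the usual normal-approximation formula for the sample size of a proportion, n = z² · p(1 − p) / d², where d is the half-width of the confidence interval (0.05 for a total width of 0.10). The abstract does not state which formula the authors used, so the sketch below is an assumption that happens to reproduce their number.

```python
from statistics import NormalDist


def sample_size_for_proportion(p: float, half_width: float,
                               confidence: float = 0.95) -> float:
    """Normal-approximation sample size for estimating a proportion p
    with a confidence interval of the given half-width."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    return z * z * p * (1 - p) / half_width ** 2


# Expected recall of 0.5 and a total interval width of 0.10 (half-width 0.05).
n = sample_size_for_proportion(p=0.5, half_width=0.05)
print(f"required sample size is about {n:.1f}")
```

The result is just above 384. Choosing p = 0.5 is also the conservative case, since p(1 − p) is largest there and any other expected recall would require fewer concepts.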

Conclusions

This protocol is based on the standardized statistical methods from clinical research to create the corpus, thus ensuring the necessary statistical power of the assessed results. Such evaluation methodology is required to make the ADR information extraction software useful for postmarketing drug safety surveillance.

International Registered Report Identifier (IRRID)

RR1-10.2196/11448

Affiliation(s)

Armelle Arnoux-Guenegou: INSERM U1138 - Team 22, Information Sciences to Support Personalized Medicine, Centre de Recherche des Cordeliers, Paris, France
Yannick Girardeau: INSERM U1138 - Team 22, Information Sciences to Support Personalized Medicine, Centre de Recherche des Cordeliers, Paris, France; Département d'Informatique Médicale, Hôpital Européen Georges-Pompidou, Assistance Publique - Hôpitaux de Paris, Paris, France
Xiaoyi Chen: INSERM U1138 - Team 22, Information Sciences to Support Personalized Medicine, Centre de Recherche des Cordeliers, Paris, France
Myrtille Deldossi: Innovative Projects - Text Mining, Expert System, Paris, France
Rim Aboukhamis: Centre Régional de Pharmacovigilance, Hôpital Européen Georges-Pompidou, Assistance Publique - Hôpitaux de Paris, Paris, France
Carole Faviez: Kappa Santé, Paris, France
Badisse Dahamna: Service d'Informatique Biomédicale, D2IM, Centre Hospitalier Universitaire de Rouen, Rouen, France
Pierre Karapetiantz: INSERM U1138 - Team 22, Information Sciences to Support Personalized Medicine, Centre de Recherche des Cordeliers, Paris, France
Sylvie Guillemin-Lanne: Innovative Projects - Text Mining, Expert System, Paris, France
Agnès Lillo-Le Louët: Centre Régional de Pharmacovigilance, Hôpital Européen Georges-Pompidou, Assistance Publique - Hôpitaux de Paris, Paris, France
Nathalie Texier: Kappa Santé, Paris, France
Anita Burgun: INSERM U1138 - Team 22, Information Sciences to Support Personalized Medicine, Centre de Recherche des Cordeliers, Paris, France; Département d'Informatique Médicale, Hôpital Européen Georges-Pompidou, Assistance Publique - Hôpitaux de Paris, Paris, France; INSERM U1138 - Team 22, Information Sciences to Support Personalized Medicine, Paris Descartes University, Sorbonne Paris Cité, Paris, France
Sandrine Katsahian: INSERM U1138 - Team 22, Information Sciences to Support Personalized Medicine, Centre de Recherche des Cordeliers, Paris, France; INSERM U1138 - Team 22, Information Sciences to Support Personalized Medicine, Paris Descartes University, Sorbonne Paris Cité, Paris, France; Clinical Research Unit Hôpitaux Universitaires Paris Ouest, Hôpital Européen Georges-Pompidou, Assistance Publique - Hôpitaux de Paris, Paris, France; INSERM CIC1418, Clinical Epidemiology, Hôpital Européen Georges-Pompidou, Paris, France

Sotoodeh M, Ho JC. Improving length of stay prediction using a hidden Markov model. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2019;2019:425-434. [PMID: 31258996 PMCID: PMC6568102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]

Zhang M, Zhang M, Ge C, Liu Q, Wang J, Wei J, Zhu KQ. Automatic discovery of adverse reactions through Chinese social media. Data Min Knowl Discov 2019. [DOI: 10.1007/s10618-018-00610-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 01/04/2023]

Harnessing social media data for pharmacovigilance: a review of current state of the art, challenges and future directions. INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS 2019. [DOI: 10.1007/s41060-019-00175-3] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Indexed: 01/07/2023]

Hoang T, Liu J, Pratt N, Zheng VW, Chang KC, Roughead E, Li J. Authenticity and credibility aware detection of adverse drug events from social media. Int J Med Inform 2018;120:157-171. [PMID: 30409341 DOI: 10.1016/j.ijmedinf.2018.10.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 01/20/2017] [Revised: 09/11/2018] [Accepted: 10/09/2018] [Indexed: 11/16/2022]

Abstract

OBJECTIVES

Adverse drug events (ADEs) are among the top causes of hospitalization and death. Social media is a promising open data source for the timely detection of potential ADEs. In this paper, we study the problem of detecting signals of ADEs from social media.

METHODS

Detecting ADEs whose drug and AE may be reported in different posts of a user leads to major concerns regarding the content authenticity and user credibility, which have not been addressed in previous studies. Content authenticity concerns whether a post mentions drugs or adverse events that are actually consumed or experienced by the writer. User credibility indicates the degree to which chronological evidence from a user's sequence of posts should be trusted in the ADE detection. We propose AC-SPASM, a Bayesian model for the authenticity and credibility aware detection of ADEs from social media. The model exploits the interaction between content authenticity, user credibility and ADE signal quality. In particular, we argue that the credibility of a user correlates with the user's consistency in reporting authentic content.

RESULTS

We conduct experiments on a real-world Twitter dataset containing 1.2 million posts from 13,178 users. Our benchmark set contains 22 drugs and 8089 AEs. AC-SPASM recognizes authentic posts with an F₁ score (the harmonic mean of precision and recall) of 80%, and estimates user credibility with precision@10 = 90% and NDCG@10 (a measure of top-10 ranking quality) of 96%. Upon validation against known ADEs, AC-SPASM achieves F₁ = 91%, outperforming state-of-the-art baseline models by 32% (p < 0.05). Also, AC-SPASM obtains precision@456 = 73% and NDCG@456 = 94% in detecting and prioritizing unknown potential ADE signals for further investigation. Furthermore, the results show that AC-SPASM is scalable to large datasets.

CONCLUSIONS

Our study demonstrates that taking into account the content authenticity and user credibility improves the detection of ADEs from social media. Our work generates hypotheses to reduce experts' guesswork in identifying unknown potential ADEs.
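The evaluation measures named in the results above (F₁, precision@k, NDCG@k) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the toy ranking, and the choice of a linear gain with a log2 rank discount for NDCG are assumptions made for illustration, and the abstract does not state which NDCG variant was used.

```python
import math


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_at_k(relevance, k):
    """Fraction of the top-k ranked items that are relevant.

    `relevance` lists one label per ranked item, best-ranked first;
    any value greater than zero counts as relevant.
    """
    return sum(1 for rel in relevance[:k] if rel > 0) / k


def dcg_at_k(relevance, k):
    """Discounted cumulative gain: linear gain, log2 rank discount (assumed variant)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevance[:k]))


def ndcg_at_k(relevance, k):
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevance, reverse=True), k)
    return dcg_at_k(relevance, k) / ideal if ideal > 0 else 0.0


# Hypothetical ranking of five candidate signals (1 = confirmed, 0 = not confirmed).
toy_ranking = [1, 0, 1, 1, 0]
print(precision_at_k(toy_ranking, 3))
print(ndcg_at_k(toy_ranking, 3))
print(f1_score(0.5, 1.0))
```

Precision@k only counts how many of the top k items are relevant, whereas NDCG@k also rewards placing relevant items nearer the top, which is why the two figures reported for the same cutoff can differ.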

Hoang T, Liu J, Pratt N, Zheng VW, Chang KC, Roughead E, Li J. Authenticity and credibility aware detection of adverse drug events from social media. Int J Med Inform 2018;120:101-115. [PMID: 30409335 DOI: 10.1016/j.ijmedinf.2018.09.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 01/20/2017] [Accepted: 09/03/2018] [Indexed: 11/29/2022]

Convertino I, Ferraro S, Blandizzi C, Tuccori M. The usefulness of listening social media for pharmacovigilance purposes: a systematic review. Expert Opin Drug Saf 2018;17:1081-1093. [DOI: 10.1080/14740338.2018.1531847] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 10/28/2022]

Tricco AC, Zarin W, Lillie E, Jeblee S, Warren R, Khan PA, Robson R, Pham B, Hirst G, Straus SE. Utility of social media and crowd-intelligence data for pharmacovigilance: a scoping review. BMC Med Inform Decis Mak 2018;18:38. [PMID: 29898743 PMCID: PMC6001022 DOI: 10.1186/s12911-018-0621-y] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Received: 10/19/2017] [Accepted: 05/31/2018] [Indexed: 12/15/2022] Open

Abstract

BACKGROUND

We conducted a scoping review to characterize the literature on the use of conversations in social media as a potential source of data for detecting adverse events (AEs) related to health products.

METHODS

Our specific research questions were (1) What social media listening platforms exist to detect adverse events related to health products, and what are their capabilities and characteristics? (2) What is the validity and reliability of data from social media for detecting these adverse events? MEDLINE, EMBASE, Cochrane Library, and relevant websites were searched from inception to May 2016. Any type of document (e.g., manuscripts, reports) that described the use of social media data for detecting health product AEs was included. Two reviewers independently screened citations and full-texts, and one reviewer and one verifier performed data abstraction. Descriptive synthesis was conducted.

RESULTS

After screening 3631 citations and 321 full-texts, 70 unique documents with 7 companion reports available from 2001 to 2016 were included. Forty-six documents (66%) described an automated or semi-automated information extraction system to detect health product AEs from social media conversations (in the developmental phase). Seven pre-existing information extraction systems to mine social media data were identified in eight documents. Nineteen documents compared AEs reported in social media data with validated data and found consistent AE discovery in all except two documents. None of the documents reported the validity and reliability of the overall system, but some reported on the performance of individual steps in processing the data. The validity and reliability results were found for the following steps in the data processing pipeline: data de-identification (n = 1), concept identification (n = 3), concept normalization (n = 2), and relation extraction (n = 8). The methods varied widely, and some approaches yielded better results than others.

CONCLUSIONS

Our results suggest that the use of social media conversations for pharmacovigilance is in its infancy. Although social media data has the potential to supplement data from regulatory agency databases, is able to capture less frequently reported AEs, and can identify AEs earlier than official alerts or regulatory changes, the utility and validity of the data source remain under-studied.

TRIAL REGISTRATION

Open Science Framework ( https://osf.io/kv9hu/ ).

Karapetiantz P, Bellet F, Audeh B, Lardon J, Leprovost D, Aboukhamis R, Morlane-Hondère F, Grouin C, Burgun A, Katsahian S, Jaulent MC, Beyens MN, Lillo-Le Louët A, Bousquet C. Descriptions of Adverse Drug Reactions Are Less Informative in Forums Than in the French Pharmacovigilance Database but Provide More Unexpected Reactions. Front Pharmacol 2018;9:439. [PMID: 29765326 PMCID: PMC5938397 DOI: 10.3389/fphar.2018.00439] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 01/31/2018] [Accepted: 04/13/2018] [Indexed: 01/28/2023] Open

Abstract

Background: Social media have drawn attention for their potential use in Pharmacovigilance. Recent work showed that it is possible to extract information concerning adverse drug reactions (ADRs) from posts in social media. The main objective of the Vigi4MED project was to evaluate the relevance and quality of the information shared by patients on web forums about drug safety and its potential utility for pharmacovigilance.

Methods: After selecting websites of interest, we manually evaluated the relevance of the content of posts for pharmacovigilance related to six drugs (agomelatine, baclofen, duloxetine, exenatide, strontium ranelate, and tetrazepam). We compared forums to the French Pharmacovigilance Database (FPVD) to (1) evaluate whether they contained relevant information to characterize a pharmacovigilance case report (patient’s age and sex; treatment indication, dose and duration; time-to-onset (TTO) and outcome of the ADR, and drug dechallenge and rechallenge) and (2) perform impact analysis (nature, seriousness, unexpectedness, and outcome of the ADR).

Results: The cases in the FPVD were significantly more informative than posts in forums for patient description (age, sex), treatment description (dose, duration, TTO), and outcome of the ADR, but the indication for the treatment was more often found in forums. Cases were more often serious in the FPVD than in forums (46% vs. 4%), but forums more often contained an unexpected ADR than the FPVD (24% vs. 17%). Moreover, 197 unexpected ADRs identified in forums were absent from the FPVD and the distribution of the MedDRA System Organ Classes (SOCs) was different between the two data sources.

Discussion: This study is the first to evaluate if patients’ posts may qualify as potential and informative case reports that should be stored in a pharmacovigilance database in the same way as case reports submitted by health professionals. The posts were less informative (except for the indication) and focused on less serious ADRs than the FPVD cases, but more unexpected ADRs were presented in forums than in the FPVD and their SOCs were different. Thus, web forums should be considered as a secondary, but complementary source for pharmacovigilance.

Affiliation(s)

Pierre Karapetiantz Sorbonne Université, INSERM, Université Paris 13, Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances en e-Santé, Paris, France
Florelle Bellet Centre Régional de Pharmacovigilance, Centre Hospitalier Universitaire de Saint-Étienne, Hôpital Nord, Saint-Étienne, France
Bissan Audeh Université de Lyon, IMT Mines Saint-Etienne, Institut Henri Fayol, Département ISI, Université Jean Monnet, Institut d'Optique Graduate School, Centre National de la Recherche Scientifique, Laboratoire Hubert Curien, Saint-Étienne, France
Jérémy Lardon Sorbonne Université, INSERM, Université Paris 13, Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances en e-Santé, Paris, France
Damien Leprovost Sorbonne Université, INSERM, Université Paris 13, Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances en e-Santé, Paris, France
Rim Aboukhamis Centre Régional de Pharmacovigilance, Hôpital Européen Georges-Pompidou, Assistance Publique - Hôpitaux de Paris, Paris, France
François Morlane-Hondère LIMSI, CNRS, Université Paris-Saclay, Orsay, France
Cyril Grouin LIMSI, CNRS, Université Paris-Saclay, Orsay, France
Anita Burgun INSERM UMRS1138 Centre de Recherche des Cordeliers, Paris, France; Département d'Informatique Médicale, Hôpital Européen Georges-Pompidou, Assistance Publique - Hôpitaux de Paris, Paris, France
Sandrine Katsahian INSERM UMRS1138 Centre de Recherche des Cordeliers, Paris, France; Département d'Informatique Médicale, Hôpital Européen Georges-Pompidou, Assistance Publique - Hôpitaux de Paris, Paris, France
Marie-Christine Jaulent Sorbonne Université, INSERM, Université Paris 13, Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances en e-Santé, Paris, France
Marie-Noëlle Beyens Centre Régional de Pharmacovigilance, Centre Hospitalier Universitaire de Saint-Étienne, Hôpital Nord, Saint-Étienne, France
Agnès Lillo-Le Louët Centre Régional de Pharmacovigilance, Hôpital Européen Georges-Pompidou, Assistance Publique - Hôpitaux de Paris, Paris, France
Cédric Bousquet Sorbonne Université, INSERM, Université Paris 13, Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances en e-Santé, Paris, France

Beyond vector space model for hierarchical Arabic text classification: A Markov chain approach. Inf Process Manag 2018. [DOI: 10.1016/j.ipm.2017.10.003] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Indexed: 11/20/2022]

SSEL-ADE: A semi-supervised ensemble learning framework for extracting adverse drug events from social media. Artif Intell Med 2017;84:34-49. [PMID: 29111222 DOI: 10.1016/j.artmed.2017.10.003] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 04/13/2017] [Revised: 08/28/2017] [Accepted: 10/15/2017] [Indexed: 11/21/2022]

Al-Thuhli A, Al-Badawi M, Baghdadi Y, Al-Hamdani A. A Framework for Interfacing Unstructured Data Into Business Process From Enterprise Social Networks. International Journal of Enterprise Information Systems 2017. [DOI: 10.4018/ijeis.2017100102] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/09/2022]

Liu Y, Shi J, Chen Y. Patient-centered and experience-aware mining for effective adverse drug reaction discovery in online health forums. J Assoc Inf Sci Technol 2017. [DOI: 10.1002/asi.23929] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 11/11/2022]

Bousquet C, Dahamna B, Guillemin-Lanne S, Darmoni SJ, Faviez C, Huot C, Katsahian S, Leroux V, Pereira S, Richard C, Schück S, Souvignet J, Lillo-Le Louët A, Texier N. The Adverse Drug Reactions from Patient Reports in Social Media Project: Five Major Challenges to Overcome to Operationalize Analysis and Efficiently Support Pharmacovigilance Process. JMIR Res Protoc 2017;6:e179. [PMID: 28935617 PMCID: PMC5629348 DOI: 10.2196/resprot.6463] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 11/14/2016] [Revised: 06/19/2017] [Accepted: 07/12/2017] [Indexed: 11/13/2022] Open

Abstract

Background

Adverse drug reactions (ADRs) are an important cause of morbidity and mortality. The classical pharmacovigilance process is limited by underreporting, which justifies the current interest in new knowledge sources such as social media. The Adverse Drug Reactions from Patient Reports in Social Media (ADR-PRISM) project aims to extract ADRs reported by patients in these media. We identified 5 major challenges to overcome to operationalize the analysis of patient posts: (1) variable quality of information on social media, (2) guarantee of data privacy, (3) response to pharmacovigilance expert expectations, (4) identification of relevant information within Web pages, and (5) robust and evolvable architecture.

Objective

This article aims to describe the current state of advancement of the ADR-PRISM project by focusing on the solutions we have chosen to address these 5 major challenges.

Methods

In this article, we propose methods and describe the advancement of this project on several aspects: (1) a quality driven approach for selecting relevant social media for the extraction of knowledge on potential ADRs, (2) an assessment of ethical issues and French regulation for the analysis of data on social media, (3) an analysis of pharmacovigilance expert requirements when reviewing patient posts on the Internet, (4) an extraction method based on natural language processing, pattern based matching, and selection of relevant medical concepts in reference terminologies, and (5) specifications of a component-based architecture for the monitoring system.

Results

Considering the 5 major challenges, we (1) selected a set of 21 validated criteria for selecting social media to support the extraction of potential ADRs, (2) proposed solutions to guarantee data privacy of patients posting on the Internet, (3) took into account pharmacovigilance expert requirements with use case diagrams and scenarios, (4) built domain-specific knowledge resources embedding a lexicon, morphological rules, context rules, semantic rules, syntactic rules, and post-analysis processing, and (5) proposed a component-based architecture that allows storage of big data and accessibility to third-party applications through Web services.

Conclusions

We demonstrated the feasibility of implementing a component-based architecture that allows collection of patient posts on the Internet, near real-time processing of those posts including annotation, and storage in big data structures. In the next steps, we will evaluate the posts identified by the system in social media to clarify the interest and relevance of such an approach for improving conventional pharmacovigilance processes based on spontaneous reporting.

Golder S, Ahmed S, Norman G, Booth A. Attitudes Toward the Ethics of Research Using Social Media: A Systematic Review. J Med Internet Res 2017;19:e195. [PMID: 28588006 PMCID: PMC5478799 DOI: 10.2196/jmir.7082] [Citation(s) in RCA: 110] [Impact Index Per Article: 15.7] [Received: 12/01/2016] [Revised: 03/13/2017] [Accepted: 03/30/2017] [Indexed: 11/13/2022] Open

Abstract

BACKGROUND

Although primarily used for social networking and often used for social support and dissemination, data on social media platforms are increasingly being used to facilitate research. However, the ethical challenges in conducting social media research remain of great concern. Although much debated in the literature, it is the views of the public that are most pertinent to inform future practice.

OBJECTIVE

The aim of our study was to ascertain attitudes on the ethical considerations of using social media as a data source for research as expressed by social media users and researchers.

METHODS

A systematic review was conducted, wherein 16 databases and 2 Internet search engines were searched in addition to handsearching, reference checking, citation searching, and contacting authors and experts. Studies that conducted any qualitative methods to collect data on attitudes on the ethical implications of research using social media were included. Quality assessment was conducted using the quality of reporting tool (QuaRT) and findings analyzed using inductive thematic synthesis.

RESULTS

In total, 17 studies met the inclusion criteria. Attitudes varied from overly positive with people expressing the views about the essential nature of such research for the public good, to very concerned with views that social media research should not happen. Underlying reasons for this variation related to issues such as the purpose and quality of the research, the researcher affiliation, and the potential harms. The methods used to conduct the research were also important. Many respondents were positive about social media research while adding caveats such as the need for informed consent or use restricted to public platforms only.

CONCLUSIONS

Many conflicting issues contribute to the complexity of good ethical practice in social media research. However, this should not deter researchers from conducting social media research. Each Internet research project requires an individual assessment of its own ethical issues. Guidelines on ethical conduct should be based on current evidence and standardized to avoid discrepancies between, and duplication across, different institutions, taking into consideration different jurisdictions.

Krallinger M, Rabal O, Lourenço A, Oyarzabal J, Valencia A. Information Retrieval and Text Mining Technologies for Chemistry. Chem Rev 2017;117:7673-7761. [PMID: 28475312 DOI: 10.1021/acs.chemrev.6b00851] [Citation(s) in RCA: 111] [Impact Index Per Article: 15.9] [Indexed: 01/03/2023]

Alvaro N, Miyao Y, Collier N. TwiMed: Twitter and PubMed Comparable Corpus of Drugs, Diseases, Symptoms, and Their Relations. JMIR Public Health Surveill 2017;3:e24. [PMID: 28468748 PMCID: PMC5438461 DOI: 10.2196/publichealth.6396] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 07/28/2016] [Revised: 11/24/2016] [Accepted: 03/20/2017] [Indexed: 12/11/2022] Open

Clinicians' Reports in Electronic Health Records Versus Patients' Concerns in Social Media: A Pilot Study of Adverse Drug Reactions of Aspirin and Atorvastatin. Drug Saf 2016;39:241-50. [PMID: 26715498 DOI: 10.1007/s40264-015-0381-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Indexed: 10/22/2022]

A New Data Representation Based on Training Data Characteristics to Extract Drug Name Entity in Medical Text. Computational Intelligence and Neuroscience 2016;2016:3483528. [PMID: 27843447 PMCID: PMC5098107 DOI: 10.1155/2016/3483528] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 05/27/2016] [Revised: 08/08/2016] [Accepted: 09/18/2016] [Indexed: 11/18/2022]

Korkontzelos I, Nikfarjam A, Shardlow M, Sarker A, Ananiadou S, Gonzalez GH. Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts. J Biomed Inform 2016;62:148-58. [PMID: 27363901 PMCID: PMC4981644 DOI: 10.1016/j.jbi.2016.06.007] [Citation(s) in RCA: 58] [Impact Index Per Article: 7.3] [Received: 11/19/2015] [Revised: 06/03/2016] [Accepted: 06/22/2016] [Indexed: 12/03/2022]

Abstract

• Sentiment analysis features are useful in spotting adverse drug reactions in text.
• Sentiment analysis features help to distinguish adverse drug reactions and indications.
• Posts about adverse drug reactions are associated with negative feelings.

Objective

The abundance of text available in social media and health related forums along with the rich expression of public opinion have recently attracted the interest of the public health community to use these sources for pharmacovigilance. Based on the intuition that patients post about Adverse Drug Reactions (ADRs) expressing negative sentiments, we investigate the effect of sentiment analysis features in locating ADR mentions.

Methods

We enrich the feature space of a state-of-the-art ADR identification method with sentiment analysis features. Using a corpus of posts from the DailyStrength forum and tweets annotated for ADR and indication mentions, we evaluate the extent to which sentiment analysis features help in locating ADR mentions and distinguishing them from indication mentions.

Results

Evaluation results show that sentiment analysis features marginally improve ADR identification in tweets and health related forum posts. Adding sentiment analysis features achieved a statistically significant F-measure increase from 72.14% to 73.22% in the Twitter part of an existing corpus using its original train/test split. Using stratified 10 × 10-fold cross-validation, statistically significant F-measure increases were shown in the DailyStrength part of the corpus, from 79.57% to 80.14%, and in the Twitter part of the corpus, from 66.91% to 69.16%. Moreover, sentiment analysis features are shown to reduce the number of ADRs being recognized as indications.

Conclusion

This study shows that adding sentiment analysis features can marginally improve the performance of even a state-of-the-art ADR identification method. This improvement can be of use to pharmacovigilance practice, due to the rapidly increasing popularity of social media and health forums.

Bravo À, Li TS, Su AI, Good BM, Furlong LI. Combining machine learning, crowdsourcing and expert knowledge to detect chemical-induced diseases in text. Database: The Journal of Biological Databases and Curation 2016;2016:baw094. [PMID: 27307137 PMCID: PMC4908671 DOI: 10.1093/database/baw094] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 12/04/2015] [Accepted: 05/10/2016] [Indexed: 01/13/2023]

Golder S, Norman G, Loke YK. Systematic review on the prevalence, frequency and comparative value of adverse events data in social media. Br J Clin Pharmacol 2015;80:878-88. [PMID: 26271492 PMCID: PMC4594731 DOI: 10.1111/bcp.12746] [Citation(s) in RCA: 66] [Impact Index Per Article: 7.3] [Received: 06/12/2015] [Revised: 07/16/2015] [Accepted: 08/03/2015] [Indexed: 11/27/2022] Open

Sloane R, Osanlou O, Lewis D, Bollegala D, Maskell S, Pirmohamed M. Social media and pharmacovigilance: A review of the opportunities and challenges. Br J Clin Pharmacol 2015;80:910-20. [PMID: 26147850 PMCID: PMC4594734 DOI: 10.1111/bcp.12717] [Citation(s) in RCA: 68] [Impact Index Per Article: 7.6] [Received: 06/15/2015] [Revised: 06/29/2015] [Accepted: 07/03/2015] [Indexed: 01/23/2023] Open

Lardon J, Abdellaoui R, Bellet F, Asfari H, Souvignet J, Texier N, Jaulent MC, Beyens MN, Burgun A, Bousquet C. Adverse Drug Reaction Identification and Extraction in Social Media: A Scoping Review. J Med Internet Res 2015;17:e171. [PMID: 26163365 PMCID: PMC4526988 DOI: 10.2196/jmir.4304] [Citation(s) in RCA: 67] [Impact Index Per Article: 7.4] [Received: 01/30/2015] [Revised: 04/09/2015] [Accepted: 04/22/2015] [Indexed: 02/06/2023] Open

Abstract

Background

The underreporting of adverse drug reactions (ADRs) through traditional reporting channels is a limitation in the efficiency of the current pharmacovigilance system. Patients’ experiences with drugs that they report on social media represent a new source of data that may have some value in postmarketing safety surveillance.

Objective

A scoping review was undertaken to explore the breadth of evidence about the use of social media as a new source of knowledge for pharmacovigilance.

Methods

Daudt et al’s recommendations for scoping reviews were followed. The research questions were as follows: How can social media be used as a data source for postmarketing drug surveillance? What are the available methods for extracting data? What are the different ways to use these data? We queried PubMed, Embase, and Google Scholar to extract relevant articles that were published before June 2014 and with no lower date limit. Two pairs of reviewers independently screened the selected studies and proposed two themes of review: manual ADR identification (theme 1) and automated ADR extraction from social media (theme 2). Descriptive characteristics were collected from the publications to create a database for themes 1 and 2.

Results

Of the 1032 citations from PubMed and Embase, 11 were relevant to the research question. An additional 13 citations were added after further research on the Internet and in reference lists. Themes 1 and 2 explored 11 and 13 articles, respectively. Ways of approaching the use of social media as a pharmacovigilance data source were identified.

Conclusions

This scoping review noted multiple methods for identifying target data, extracting them, and evaluating the quality of medical information from social media. It also showed some remaining gaps in the field. Studies related to the identification theme usually failed to accurately assess the completeness, quality, and reliability of the data that were analyzed from social media. Regarding extraction, no study proposed a generic approach to easily adding a new site or data source. Additional studies are required to precisely determine the role of social media in the pharmacovigilance system.

Sarker A, Ginn R, Nikfarjam A, O'Connor K, Smith K, Jayaraman S, Upadhaya T, Gonzalez G. Utilizing social media data for pharmacovigilance: A review. J Biomed Inform 2015;54:202-12. [PMID: 25720841 DOI: 10.1016/j.jbi.2015.02.004] [Citation(s) in RCA: 238] [Impact Index Per Article: 26.4] [Received: 10/23/2014] [Revised: 01/02/2015] [Accepted: 02/15/2015] [Indexed: 10/23/2022]

Abstract

OBJECTIVE

Automatic monitoring of Adverse Drug Reactions (ADRs), defined as adverse patient outcomes caused by medications, is a challenging research problem that is currently receiving significant attention from the medical informatics community. In recent years, user-posted data on social media, primarily due to its sheer volume, has become a useful resource for ADR monitoring. Research using social media data has progressed using various data sources and techniques, making it difficult to compare distinct systems and their performances. In this paper, we perform a methodical review to characterize the different approaches to ADR detection/extraction from social media, and their applicability to pharmacovigilance. In addition, we present a potential systematic pathway to ADR monitoring from social media.

METHODS

We identified studies describing approaches for ADR detection from social media from the Medline, Embase, Scopus and Web of Science databases, and the Google Scholar search engine. Studies that met our inclusion criteria were those that attempted to extract ADR information posted by users on any publicly available social media platform. We categorized the studies according to different characteristics such as primary ADR detection approach, size of corpus, data source(s), availability, and evaluation criteria.

RESULTS

Twenty-two studies met our inclusion criteria, with fifteen (68%) published within the last two years. However, publicly available annotated data is still scarce, and we found only six studies that made the annotations used publicly available, making system performance comparisons difficult. In terms of algorithms, supervised classification techniques to detect posts containing ADR mentions, and lexicon-based approaches for extraction of ADR mentions from texts have been the most popular.

CONCLUSION

Our review suggests that interest in the utilization of the vast amounts of available social media data for ADR monitoring is increasing. In terms of sources, both health-related and general social media data have been used for ADR detection; while health-related sources tend to contain higher proportions of relevant data, the volume of data from general social media websites is significantly higher. There is still a very limited amount of annotated data publicly available, and, as indicated by the promising results obtained by recent supervised learning approaches, there is a strong need to make such data available to the research community.
