1. Link S, Kammler A, Gupta R, Hembade M, Kumar R, George V. Enhancing the Efficiency of the Individual Case Safety Report (ICSR) Quality and Compliance through Automation. Curr Drug Saf 2024; 19:255-260. [PMID: 37533250 DOI: 10.2174/1574886318666230801162002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Revised: 06/05/2023] [Accepted: 06/19/2023] [Indexed: 08/04/2023]
Abstract
BACKGROUND Over the past few years, major inspection findings have been identified in the "management of adverse reactions" that may be due to increasing workload in pharmaceutical organizations impacting the correctness of information in individual case safety reports (ICSRs). Although retrospective quality check (Retro-QC) and late submission analyses are important steps in ensuring ICSR quality, their manual application poses several challenges that can be overcome through automation. OBJECTIVES To improve the efficiency of the Retro-QC analysis and late submission analysis using a computer-operated tool called Compliance and Metrics Management (CMM) tool, and to measure the tool's effectiveness in terms of productivity, time, and cost savings by comparing against the manual process. METHODS Time savings were calculated by measuring the difference in time taken during the manual process versus the automated process. Cost savings were measured in terms of hourly remuneration for the time saved. Productivity was calculated as the difference between the number of cases handled in the manual versus automated process. Thus, the overall efficiency was measured in terms of time and cost savings along with increased productivity. RESULTS Automation resulted in time savings of 49% and cost savings of 43% for Retro-QC analysis, and the productivity level increased by 67%. For late submission analysis, the CMM tool resulted in time savings of 88% and cost savings of 87%. CONCLUSION CMM tool enhanced the efficiency of both Retro-QC and late submission analyses by increasing productivity along with time and cost savings. It also reduced the number of errors, thereby enhancing the accuracy of the process and overall compliance.
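The metric definitions in the METHODS portion of this abstract reduce to simple arithmetic. The following is a minimal Python sketch under stated assumptions: the function names are illustrative, productivity is expressed as a percentage of the manual case count (the abstract only says "difference"), and all input values are hypothetical rather than figures from the study.

```python
def percent_time_savings(manual_hours: float, automated_hours: float) -> float:
    """Time saved by automation, as a percentage of the manual time."""
    return 100.0 * (manual_hours - automated_hours) / manual_hours


def cost_savings(manual_hours: float, automated_hours: float, hourly_rate: float) -> float:
    """Hours saved, valued at an hourly remuneration rate."""
    return (manual_hours - automated_hours) * hourly_rate


def percent_productivity_gain(manual_cases: int, automated_cases: int) -> float:
    """Increase in cases handled, as a percentage of the manual case count."""
    return 100.0 * (automated_cases - manual_cases) / manual_cases


# Hypothetical inputs for illustration only; not taken from the paper.
print(percent_time_savings(40.0, 30.0))     # 25.0
print(cost_savings(40.0, 30.0, 50.0))       # 500.0
print(percent_productivity_gain(120, 150))  # 25.0
```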
Affiliation(s)
- Shannon Link
  - Global Pharmacovigilance Compliance and Business Management (Compliance & Analytics), Otsuka Pharmaceutical Development and Commercialization Inc., Princeton, New Jersey, 08540, USA
- Adam Kammler
  - Global Pharmacovigilance Compliance and Business Management (Compliance & Analytics), Otsuka Pharmaceutical Development and Commercialization Inc., Princeton, New Jersey, 08540, USA
- Ritu Gupta
  - Product Strategy, Vitrana Inc., Cranbury, New Jersey, 08512, USA
- Mahendra Hembade
  - Global Pharmacovigilance Compliance and Business Management (Compliance & Analytics), Tata Consultancy Services, Thane, Maharashtra, 400607, India
- Retesh Kumar
  - Global Pharmacovigilance Medical Safety, Otsuka Pharmaceutical Europe Limited (OPEL), Gallions Wexham Springs, Framewood Rd, Wexham, SL3 6PJ, United Kingdom
- Vinu George
  - Global Pharmacovigilance, Otsuka Pharmaceutical Development and Commercialization Inc., Princeton, New Jersey, 08540, USA
2. Zulbayar S, Mollayeva T, Colantonio A, Chan V, Escobar M. Integrating unsupervised and supervised learning techniques to predict traumatic brain injury: A population-based study. Intelligence-Based Medicine 2023; 8:100118. [PMID: 38222038 PMCID: PMC10785655 DOI: 10.1016/j.ibmed.2023.100118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2024]
Abstract
This work aimed to identify pre-existing health conditions of patients with traumatic brain injury (TBI) and to develop predictive models for the first TBI event and its external causes by combining unsupervised and supervised learning algorithms. We acquired up to five years of pre-injury diagnoses for 488,107 patients with TBI and 488,107 matched control patients who entered the emergency department or acute care hospitals between April 1st, 2002, and March 31st, 2020. Diagnoses were obtained from the Ontario Health Insurance Plan (OHIP) database, which contains province-wide claims data submitted by physicians in Ontario, Canada, for inpatient and outpatient services. A screening process on the OHIP diagnostic codes limited the subsequent analysis to codes that were predictive of TBI; it found that 314 codes were significantly associated with TBI. The Latent Dirichlet Allocation (LDA) model was applied to the diagnostic codes and generated an optimal number of 19 topics that concur with published literature but also suggest other unexplored areas. Estimated word-topic probabilities from the LDA model helped us detect pre-morbid conditions among patients with TBI by uncovering the underlying patterns of diagnoses, while estimated document-topic probabilities were used to create variables as a form of dimension reduction. We created 19 topic scores for each patient in the cohort, which were used along with socio-demographic factors in Random Forest binary classifier models. Test set performances, evaluated using the area under the receiver operating characteristic curve (AUC), were: TBI event (AUC = 0.85); external cause of injury: falls (AUC = 0.85), struck by/against (AUC = 0.83), cyclist collision (AUC = 0.76), motor vehicle collision (AUC = 0.83). Our analysis successfully demonstrated the feasibility of using machine learning to predict TBI due to various external causes and identified the most important factors that contribute to this prediction.
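The AUC values reported in this abstract can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. A minimal pure-Python sketch of that pairwise definition follows; the function name, labels, and scores are hypothetical and are not the study's data or pipeline.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for four patients (1 = TBI, 0 = control).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 0.75
```

The pairwise loop is quadratic in the number of cases, so it is suited to illustration only; a rank-based implementation would be used on a cohort of this size.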
Affiliation(s)
- Suvd Zulbayar
  - Dalla Lana School of Public Health, University of Toronto, Toronto, ON M5T 3M7, Canada
  - Institute of Health and Policy, Management and Evaluation, University of Toronto, M5T 3M6, Canada
- Tatyana Mollayeva
  - Dalla Lana School of Public Health, University of Toronto, Toronto, ON M5T 3M7, Canada
  - Rehabilitation Sciences Institute, Temerty Faculty of Medicine, University of Toronto, Toronto, ON M5G 1V7, Canada
  - Acquired Brain Injury Research Lab, Department of Occupational Science and Occupational Therapy, University of Toronto, Toronto, ON M5G 1V7, Canada
  - KITE Research Institute, Toronto Rehabilitation Institute-University Health Network, Toronto, ON M5G 2A2, Canada
- Angela Colantonio
  - Dalla Lana School of Public Health, University of Toronto, Toronto, ON M5T 3M7, Canada
  - Rehabilitation Sciences Institute, Temerty Faculty of Medicine, University of Toronto, Toronto, ON M5G 1V7, Canada
  - Acquired Brain Injury Research Lab, Department of Occupational Science and Occupational Therapy, University of Toronto, Toronto, ON M5G 1V7, Canada
  - KITE Research Institute, Toronto Rehabilitation Institute-University Health Network, Toronto, ON M5G 2A2, Canada
  - Institute of Health and Policy, Management and Evaluation, University of Toronto, M5T 3M6, Canada
  - ICES, Toronto, ON, M4N 3M5, Canada
- Vincy Chan
  - Rehabilitation Sciences Institute, Temerty Faculty of Medicine, University of Toronto, Toronto, ON M5G 1V7, Canada
  - Acquired Brain Injury Research Lab, Department of Occupational Science and Occupational Therapy, University of Toronto, Toronto, ON M5G 1V7, Canada
  - KITE Research Institute, Toronto Rehabilitation Institute-University Health Network, Toronto, ON M5G 2A2, Canada
  - Institute of Health and Policy, Management and Evaluation, University of Toronto, M5T 3M6, Canada
- Michael Escobar
  - Dalla Lana School of Public Health, University of Toronto, Toronto, ON M5T 3M7, Canada
3. Chen D, Zhang R. COVID-19 Vaccine Adverse Event Detection Based on Multi-Label Classification With Various Label Selection Strategies. IEEE J Biomed Health Inform 2023; 27:4192-4203. [PMID: 37418397 DOI: 10.1109/jbhi.2023.3292252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/09/2023]
Abstract
Analyzing massive VAERS reports without medical context may lead to incorrect conclusions about vaccine adverse events (VAE). Facilitating VAE detection promotes continual safety improvement for new vaccines. This study proposes a multi-label classification method with various term- and topic-based label selection strategies to improve the accuracy and efficiency of VAE detection. Topic modeling methods are first used to generate rule-based label dependencies from Medical Dictionary for Regulatory Activities terms in VAE reports with two hyper-parameters. Multiple label selection strategies, namely one-vs-rest (OvsR), problem transformation (PT), algorithm adaption (AA), and deep learning (DL) methods, are then used in multi-label classification to examine model performance. Experimental results on a COVID-19 VAE reporting data set indicated that the topic-based PT methods improve the accuracy by up to 33.69%, which improves the robustness and interpretability of our models. In addition, the topic-based OvsR methods achieve an optimal accuracy of up to 98.88%. The accuracy of the AA methods with topic-based labels increased by up to 87.36%. By contrast, the state-of-the-art LSTM- and BERT-based DL methods perform relatively poorly, with accuracy rates of 71.89% and 64.63%, respectively. Our findings reveal that the proposed method effectively improves model accuracy and strengthens VAE interpretability by using different label selection strategies and domain knowledge in multi-label classification for VAE detection.
4. Wang B, Gao Z, Lin Z, Wang R. A Disease-Prediction Protocol Integrating Triage Priority and BERT-Based Transfer Learning for Intelligent Triage. Bioengineering (Basel) 2023; 10:bioengineering10040420. [PMID: 37106606 PMCID: PMC10136349 DOI: 10.3390/bioengineering10040420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Revised: 03/17/2023] [Accepted: 03/24/2023] [Indexed: 03/29/2023] Open
Abstract
Large hospitals can be complex, with numerous discipline and subspecialty settings. Patients may have limited medical knowledge, making it difficult for them to determine which department to visit. As a result, visits to the wrong departments and unnecessary appointments are common. To address this issue, modern hospitals require a remote system capable of performing intelligent triage, enabling patients to perform self-service triage. To address the challenges outlined above, this study presents an intelligent triage system based on transfer learning, capable of processing multilabel neurological medical texts. The system predicts a diagnosis and corresponding department based on the patient’s input. It utilizes the triage priority (TP) method to label diagnostic combinations found in medical records, converting a multilabel problem into a single-label one. The system considers disease severity and reduces the “class overlapping” of the dataset. The BERT model classifies the chief complaint text, predicting a primary diagnosis corresponding to the complaint. To address data imbalance, a composite loss function based on cost-sensitive learning is added to the BERT architecture. The study results indicate that the TP method achieves a classification accuracy of 87.47% on medical record text, outperforming other problem transformation methods. By incorporating the composite loss function, the system’s accuracy rate improves to 88.38% surpassing other loss functions. Compared to traditional methods, this system does not introduce significant complexity, yet substantially improves triage accuracy, reduces patient input confusion, and enhances hospital triage capabilities, ultimately improving the patient’s medical experience. The findings could provide a reference for intelligent triage development.
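The abstract does not specify the composite loss function, so it is not reproduced here. As a generic illustration of cost-sensitive learning for imbalanced classes, the sketch below uses a plain class-weighted cross-entropy, which is a swapped-in standard technique rather than the authors' method; the function names, class weights, and probabilities are all hypothetical.

```python
import math


def weighted_cross_entropy(probs, label, class_weights):
    """Cross-entropy for one example, scaled by the weight of its true class."""
    return -class_weights[label] * math.log(probs[label])


def mean_weighted_loss(batch_probs, batch_labels, class_weights):
    """Average class-weighted cross-entropy over a batch."""
    losses = [
        weighted_cross_entropy(p, y, class_weights)
        for p, y in zip(batch_probs, batch_labels)
    ]
    return sum(losses) / len(losses)


# Hypothetical two-class setup: class 1 is rare, so its errors cost 4x more.
class_weights = [1.0, 4.0]
batch_probs = [[0.9, 0.1], [0.5, 0.5]]  # predicted class probabilities
batch_labels = [0, 1]                   # true classes
print(mean_weighted_loss(batch_probs, batch_labels, class_weights))
```

Raising the weight of a minority class makes its misclassifications contribute more to the loss, which pushes the model away from always predicting the majority class.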
5. Vaccine Vigilance System: Considerations on the Effectiveness of Vigilance Data Use in COVID-19 Vaccination. Vaccines (Basel) 2022; 10:vaccines10122115. [PMID: 36560525 PMCID: PMC9783025 DOI: 10.3390/vaccines10122115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2022] [Revised: 12/08/2022] [Accepted: 12/08/2022] [Indexed: 12/14/2022] Open
Abstract
(1) Background: The safety of medicines has been receiving increased attention to ensure that the risks of taking medicines do not outweigh the benefits. This is the reason why, over several decades, the pharmacovigilance system has been developed. The post-authorization pharmacovigilance system is based on reports from healthcare professionals and patients on observed adverse reactions. The reports are collected in databases and progressively evaluated. However, there are emerging concerns about the effectiveness of the established passive pharmacovigilance system in accelerating circumstances, such as the COVID-19 pandemic, when billions of doses of new vaccines were administered without a long history of use. Currently, health professionals receive fragmented new information on the safety of medicines from competent authorities after a lengthy evaluation process. Simultaneously, in the context of accelerated mass vaccination, health professionals need to have access to operational information, at least on organ systems at higher risk. Therefore, the aim of this study was to perform a primary data analysis of publicly available data on suspected COVID-19 vaccine-related adverse reactions in Europe, in order to identify the predominant groups of reported medical conditions after vaccination and their association with vaccine groups, as well as to evaluate the data accessibility on specific syndromes. (2) Methods: To achieve the objectives, the data publicly available in the EudraVigilance European Database for Suspected Adverse Drug Reaction Reports were analyzed. The following tasks were defined: (1) identify the predominant groups of medical conditions mentioned in adverse reaction reports; (2) determine the relative frequency of reports within vaccine groups; (3) assess the feasibility of obtaining information on a possibly associated syndrome, myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS). (3) Results: The data obtained demonstrate that the predominant medical conditions induced after vaccination are relevant to the following categories: (1) "General disorders and administration site conditions", (2) "nervous system disorders", and (3) "musculoskeletal and connective tissue disorders". There are more reports for mRNA vaccines, but the relative frequency of reports per dose administered is lower for this group of vaccines. Information on ME/CFS was not available, but reports of "chronic fatigue syndrome" are included in the database and accessible for primary analysis. (4) Conclusions: The information obtained on the predominantly reported medical conditions and the relevant vaccine groups may be useful for health professionals, patients, researchers, and medicine manufacturers. Policymakers could benefit from reflecting on the design of an active pharmacovigilance model, making full use of modern information technologies, including big data analysis of social media and networks for the detection of primary signals and building an early warning system.
6. Roosan D, Law AV, Roosan MR, Li Y. Artificial Intelligent Context-Aware Machine-Learning Tool to Detect Adverse Drug Events from Social Media Platforms. J Med Toxicol 2022; 18:311-320. [PMID: 36097239 PMCID: PMC9492823 DOI: 10.1007/s13181-022-00906-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/05/2022] [Revised: 07/15/2022] [Accepted: 07/18/2022] [Indexed: 10/14/2022] Open
Abstract
INTRODUCTION Pharmacovigilance (PV) has proven able to detect post-marketing adverse drug events (ADE). Previous research used natural language processing (NLP) tools to extract unstructured texts relevant to ADEs. However, texts without context reduce the efficiency of such algorithms. Our objective was to develop and validate an innovative NLP tool, aTarantula, using a context-aware machine-learning algorithm to detect existing ADEs from social media using an aggregated lexicon. METHOD aTarantula utilized FastText embeddings and an aggregated lexicon to extract contextual data from posts by patients taking warfarin on three patient forums (i.e., MedHelp, MedsChat, and PatientInfo). The lexicon used warfarin package inserts and synonyms of warfarin ADEs from the UMLS and FAERS databases. Data were stored in SQLite and then refined and manually checked by three clinical pharmacists for validation. RESULTS ADEs affecting multiple organ systems were the most frequently reported, at 1.50%, followed by CNS side effects at 1.19%. Lymphatic system ADEs were the least commonly reported side effect, at 0.09%. The overall Spearman rank correlation coefficient between patient-reported data from the forums and FAERS was 0.19. As determined by pharmacist validation, aTarantula had a sensitivity of 84.2% and a specificity of 98%. Three clinical pharmacists manually validated our results. Finally, we created an aggregated lexicon for mining ADEs from social media. CONCLUSION We successfully developed aTarantula, a machine-learning algorithm based on artificial intelligence, to automatically extract warfarin-related ADEs from online social discussion forums. Our study shows that it is feasible to use aTarantula to detect ADEs. Future researchers can validate aTarantula on more diverse datasets.
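Sensitivity and specificity, as reported for the pharmacist validation in this abstract, are ratios over a confusion matrix. A minimal sketch follows, assuming hypothetical counts; the abstract does not give the underlying counts, so the numbers below are invented for illustration.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of real ADE mentions that the tool flagged (true positive rate)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of non-ADE texts that the tool correctly left unflagged."""
    return true_neg / (true_neg + false_pos)


# Hypothetical validation counts, with pharmacist review as the reference.
print(sensitivity(true_pos=8, false_neg=2))   # 0.8
print(specificity(true_neg=45, false_pos=5))  # 0.9
```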
Affiliation(s)
- Don Roosan
  - Department of Pharmacy Practice and Administration, Western University of Health Sciences, 309 E 2nd St, Pomona, CA, 91766, USA
- Anandi V Law
  - Department of Pharmacy Practice and Administration, Western University of Health Sciences, 309 E 2nd St, Pomona, CA, 91766, USA
- Moom R Roosan
  - Department of Pharmacy Practice, Chapman University, 9401 Geronimo Rd, Irvine, CA, 92618, USA
- Yan Li
  - Center for Information Systems and Technology, Claremont Graduate University, 150 E 19th St, Claremont, CA, 91711, USA
7. Language-agnostic deep learning framework for automatic monitoring of population-level mental health from social networks. J Biomed Inform 2022; 133:104145. [PMID: 35908625 DOI: 10.1016/j.jbi.2022.104145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2022] [Revised: 06/27/2022] [Accepted: 07/15/2022] [Indexed: 11/21/2022]
Abstract
In many countries, mental health issues are among the most serious public health concerns. National mental health statistics are frequently collected from reported patient cases or government-sponsored surveys, which have restricted coverage, frequency, and timeliness. Many domains of study, including public healthcare and biomedical informatics, have recently adopted social media data as a feasible real-time alternative to traditional methods of gathering representative information at the population level in a variety of contexts. However, because of the limits of fundamental natural language processing tools and labeled corpora in countries with limited natural language resources, such as Thailand, implementing social media systems to monitor mental health signals could be challenging. This paper presents LAPoMM, a novel framework for monitoring real-time mental health indicators from social media data without using labeled datasets in low-resource languages. Specifically, we use cross-lingual methods to train language-agnostic models and validate our framework by examining cross-correlations between the aggregate predicted mental signals and real-world administrative data from Thailand's Department of Mental Health, which includes monthly depression patients and reported cases of suicidal attempts. A combination of a language-agnostic representation and a deep learning classification model outperforms all other cross-lingual techniques for recognizing various mental signals in Tweets, such as emotions, sentiments, and suicidal tendencies. The correlation analyses discover a strong positive relationship between actual depression cases and the predicted negative sentiment signals as well as suicide attempts and negative signals (e.g., fear, sadness, and disgust) and suicidal tendency. These findings establish the effectiveness of our proposed framework and its potential applications in monitoring population-level mental health using large-scale social media data. 
Furthermore, because the language-agnostic model utilized in the methodology is capable of supporting a wide range of languages, the proposed LAPoMM framework can be easily generalized for analogous applications in other countries with limited language resources.
8. Identifying Adverse Drug Reaction-Related Text from Social Media: A Multi-View Active Learning Approach with Various Document Representations. Information 2022. [DOI: 10.3390/info13040189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023] Open
Abstract
Adverse drug reactions (ADRs) are a huge public health issue. Identifying text that mentions ADRs from a large volume of social media data is important. However, we need to address two challenges for high-performing ADR-related text detection: the data imbalance problem and the requirement of simultaneously using data-driven information and handcrafted information. Therefore, we propose an approach named multi-view active learning using domain-specific and data-driven document representations (MVAL4D), endeavoring to enhance the predictive capability and alleviate the requirement of labeled data. Specifically, a new view-generation mechanism is proposed to generate multiple views by simultaneously exploiting various document representations obtained using handcrafted feature engineering and by performing deep learning methods. Moreover, different from previous active learning studies in which all instances are chosen using the same selection criterion, MVAL4D adopts different criteria (i.e., confidence and informativeness) to select potentially positive instances and potentially negative instances for manual annotation. The experimental results verify the effectiveness of MVAL4D. The proposed approach can be generalized to many other text classification tasks. Moreover, it can offer a solid foundation for the ADR mention extraction task, and improve the feasibility of monitoring drug safety using social media data.
9. An Attention-Based Multi-Representational Fusion Method for Social-Media-Based Text Classification. Information 2022. [DOI: 10.3390/info13040171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/10/2022] Open
Abstract
There exist various text-classification tasks using user-generated contents (UGC) on social media in the big data era. In view of advantages and disadvantages of feature-engineering-based machine-learning models and deep-learning models, we argue that fusing handcrafted-text representation via feature engineering and data-driven deep-text representations extracted by performing deep-learning methods is conducive to enhancing text-classification capability. Given the characteristics of different deep neural networks, their complementary effect needs to be investigated. Moreover, contributions of these representations need to be adaptively learned when it comes to addressing different tasks or predicting different samples. Therefore, in this paper, we propose a novel fused deep-neural-network architecture with a hierarchical attention mechanism for text classification with social media data. Specifically, in the context that handcraft features are available, we employ the attention mechanism to adaptively fuse totally data-driven-text representation and handcrafted representation. For the generation of the data-driven-text representation, we propose a data-driven encoder that fuses text representations derived from three deep-learning methods with the attention mechanism, to adaptively select discriminative representation and explore their complementary effect. To verify the effectiveness of our approach, we performed two text-classification tasks, i.e., identifying adverse drug reaction (ADR)-relevant tweets from social media and identifying comparative-relevant reviews from an E-commerce platform. Experimental results demonstrate that our approach outperforms other baselines.
10. Bazzaz Abkenar S, Haghi Kashani M, Mahdipour E, Jameii SM. Big data analytics meets social media: A systematic review of techniques, open issues, and future directions. Telematics and Informatics 2021; 57:101517. [PMID: 34887614 PMCID: PMC7553883 DOI: 10.1016/j.tele.2020.101517] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 06/22/2020] [Revised: 09/18/2020] [Accepted: 10/07/2020] [Indexed: 11/25/2022]
Abstract
A comprehensive systematic review on social big data analytic approaches is provided. The main methods, pros, cons, evaluation methods, and parameters are discussed. A scientific taxonomy of social big data analytic approaches is presented. A detailed list of challenges and future research directions is outlined.
Social Networking Services (SNSs) connect people worldwide, where they communicate through sharing contents, photos, videos, posting their first-hand opinions, comments, and following their friends. Social networks are characterized by velocity, volume, value, variety, and veracity, the 5 V’s of big data. Hence, big data analytic techniques and frameworks are commonly exploited in Social Network Analysis (SNA). By the ever-increasing growth of social networks, the analysis of social data, to describe and find communication patterns among users and understand their behaviors, has attracted much attention. In this paper, we demonstrate how big data analytics meets social media, and a comprehensive review is provided on big data analytic approaches in social networks to search published studies between 2013 and August 2020, with 74 identified papers. The findings of this paper are presented in terms of main journals/conferences, yearly distributions, and the distribution of studies among publishers. Furthermore, the big data analytic approaches are classified into two main categories: Content-oriented approaches and network-oriented approaches. The main ideas, evaluation parameters, tools, evaluation methods, advantages, and disadvantages are also discussed in detail. Finally, the open challenges and future directions that are worth further investigating are discussed.
Affiliation(s)
- Sepideh Bazzaz Abkenar
  - Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
- Mostafa Haghi Kashani
  - Department of Computer Engineering, Shahr-e-Qods Branch, Islamic Azad University, Tehran, Iran
- Ebrahim Mahdipour
  - Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
- Seyed Mahdi Jameii
  - Department of Computer Engineering, Shahr-e-Qods Branch, Islamic Azad University, Tehran, Iran
11. Walsh J, Cave J, Griffiths F. Spontaneously Generated Online Patient Experience of Modafinil: A Qualitative and NLP Analysis. Front Digit Health 2021; 3:598431. [PMID: 34713085 PMCID: PMC8521895 DOI: 10.3389/fdgth.2021.598431] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/24/2020] [Accepted: 01/27/2021] [Indexed: 11/16/2022] Open
Abstract
Objective: To compare the findings from a qualitative and a natural language processing (NLP) based analysis of online patient experience posts on the effectiveness and impact of the drug Modafinil. Methods: Posts (n = 260) from 5 online social media platforms where posts were publicly available formed the dataset/corpus. Three platforms asked posters to give a numerical rating of Modafinil. Thematic analysis: data were coded and themes generated. Data were categorized into PreModafinil, Acquisition, Dosage, and PostModafinil and compared to identify each poster's own view of whether taking Modafinil was linked to an identifiable outcome. We classified this as positive, mixed, negative, or neutral and compared this with numerical ratings. NLP: Corpus text was speech tagged and keywords and key terms extracted. We identified the following entities: drug names, condition names, symptoms, actions, and side-effects. We searched for simple relationships, collocations, and co-occurrences of entities. To identify causal text, we split the corpus into PreModafinil and PostModafinil and used n-gram analysis. To evaluate sentiment, we calculated the polarity of each post between −1 (negative) and +1 (positive). NLP results were mapped to qualitative results. Results: Posters had used Modafinil for 33 different primary conditions. Eight themes were identified: the reason for taking (condition or symptom), impact of symptoms, acquisition, dosage, side effects, other interventions tried or compared to, effectiveness of Modafinil, and quality of life outcomes. Posters reported perceived effectiveness as follows: 68% positive, 12% mixed, 18% negative. Our classification was consistent with poster ratings. Of the 100 most frequent keywords/keyterms identified by term extraction, 88/100 keywords and 84/100 keyterms mapped directly to the eight themes. Seven keyterms indicated negation and temporal states. Sentiment was as follows: 72% positive, 4% neutral, 24% negative. Matching of sentiment between the qualitative and NLP methods was accurate in 64.2% of posts. If we allow for a one-category difference, matching was accurate in 85% of posts. Conclusions: User-generated patient experience is a rich resource for evaluating real-world effectiveness, understanding patient perspectives, and identifying research gaps. Both methods successfully identified the entities and topics contained in the posts. In contrast to current evidence, posters with a wide range of other conditions found Modafinil effective. Perceived causality and effectiveness were identified by both methods, demonstrating the potential to augment existing knowledge.
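The comparison of qualitative and NLP sentiment in this abstract, exact matching versus matching within one category, can be sketched as follows. This is a hedged illustration: the three-level ordinal scale, the polarity threshold of 0.1, the function names, and the example posts are all assumptions, since the abstract does not state how polarity was binned or how category distance was defined (and the qualitative coding also used a "mixed" category, which is omitted here).

```python
# Assumed ordinal scale; the abstract does not define the category ordering.
ORDER = {"negative": 0, "neutral": 1, "positive": 2}


def polarity_to_category(polarity: float, threshold: float = 0.1) -> str:
    """Bin a polarity score in [-1, +1] into a sentiment category.

    The threshold is a hypothetical choice, not a value from the study.
    """
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


def agreement(manual, predicted, tolerance: int = 0) -> float:
    """Fraction of posts whose categories differ by at most `tolerance` steps."""
    matches = sum(
        abs(ORDER[m] - ORDER[p]) <= tolerance for m, p in zip(manual, predicted)
    )
    return matches / len(manual)


# Hypothetical posts: manual (qualitative) labels and NLP polarity scores.
manual = ["positive", "neutral", "negative", "positive"]
polarities = [0.6, 0.3, -0.5, -0.4]
predicted = [polarity_to_category(p) for p in polarities]

print(agreement(manual, predicted))               # 0.5  (exact match)
print(agreement(manual, predicted, tolerance=1))  # 0.75 (within one category)
```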
Affiliation(s)
- Julia Walsh
  - Warwick Medical School, University of Warwick, Coventry, United Kingdom
- Jonathan Cave
  - Department of Economics, University of Warwick, Coventry, United Kingdom
- Frances Griffiths
  - Warwick Medical School, University of Warwick, Coventry, United Kingdom
12. A. Rahim AI, Ibrahim MI, Musa KI, Chua SL, Yaacob NM. Assessing Patient-Perceived Hospital Service Quality and Sentiment in Malaysian Public Hospitals Using Machine Learning and Facebook Reviews. International Journal of Environmental Research and Public Health 2021; 18:9912. [PMID: 34574835 PMCID: PMC8466628 DOI: 10.3390/ijerph18189912] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/02/2021] [Revised: 09/17/2021] [Accepted: 09/18/2021] [Indexed: 02/05/2023]
Abstract
Social media is emerging as a new avenue for hospitals and patients to solicit input on the quality of care. However, social media data is unstructured and enormous in volume. Moreover, no empirical research on the use of social media data and perceived hospital quality of care based on patient online reviews has been performed in Malaysia. The purpose of this study was to investigate the determinants of positive sentiment expressed in hospital Facebook reviews in Malaysia, as well as the association between hospital accreditation and sentiments expressed in Facebook reviews. From 2017 to 2019, we retrieved comments from 48 official public hospitals' Facebook pages. We used machine learning to build a sentiment analyzer and service quality (SERVQUAL) classifier that automatically classifies the sentiment and SERVQUAL dimensions. We used logistic regression analysis to address these aims. We evaluated a total of 1852 reviews, and our machine learning sentiment analyzer classified 72.1% of reviews as positive and 27.9% as negative. We classified 240 reviews as tangible, 1257 reviews as trustworthy, 125 reviews as responsive, 356 reviews as assurance, and 1174 reviews as empathy using our machine learning SERVQUAL classifier. After adjusting for hospital characteristics, all SERVQUAL dimensions except Tangible were associated with positive sentiment. However, no significant relationship between hospital accreditation and online sentiment was discovered. Facebook reviews powered by machine learning algorithms provide valuable, real-time data that may be missed by traditional hospital quality assessments. Additionally, online patient reviews offer a hitherto untapped indication of quality that may benefit all healthcare stakeholders. Our results confirm prior studies and support the use of Facebook reviews as an adjunct method for assessing the quality of hospital services in Malaysia.
Affiliation(s)
- Afiq Izzudin A. Rahim
  - Department of Community Medicine, School of Medical Science, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Kelantan, Malaysia
- Mohd Ismail Ibrahim
  - Department of Community Medicine, School of Medical Science, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Kelantan, Malaysia
- Kamarul Imran Musa
  - Department of Community Medicine, School of Medical Science, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Kelantan, Malaysia
- Sook-Ling Chua
  - Faculty of Computing and Informatics, Multimedia University, Persiaran Multimedia, Cyberjaya 63100, Selangor, Malaysia
- Najib Majdi Yaacob
  - Units of Biostatistics and Research Methodology, School of Medical Sciences, Health Campus, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Kelantan, Malaysia
|
13
|
Schück S, Roustamal A, Gedik A, Voillot P, Foulquié P, Penfornis C, Job B. Assessing Patient Perceptions and Experiences of Paracetamol in France: Infodemiology Study Using Social Media Data Mining. J Med Internet Res 2021; 23:e25049. [PMID: 34255645 PMCID: PMC8314157 DOI: 10.2196/25049] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/26/2020] [Revised: 03/24/2021] [Accepted: 04/25/2021] [Indexed: 01/27/2023] [Open access]
Abstract
BACKGROUND Individuals frequently turn to social media to discuss medical conditions and medication, sharing their experiences and information and asking questions among themselves. These online discussions can provide valuable insights into individual perceptions of medical treatment, and increasingly, studies are focusing on the potential use of this information to improve health care management. OBJECTIVE The objective of this infodemiology study was to identify social media posts mentioning paracetamol-containing products to develop a better understanding of patients' opinions and perceptions of the drug. METHODS Posts between January 2003 and March 2019 containing at least one mention of paracetamol were extracted from 18 French forums in May 2019 with the use of the Detec't (Kap Code) web crawler. Posts were then analyzed using the automated Detec't tool, which uses machine learning and text mining methods to inspect social media posts and extract relevant content. Posts were classified into groups: Paracetamol Only, Paracetamol and Opioids, Paracetamol and Others, and the Aggregate group. RESULTS Overall, 44,283 posts were analyzed from 20,883 different users. Post volume over the study period showed a peak in activity between 2009 and 2012, as well as a spike in 2017 in the Aggregate group. The number of posts tended to be higher during winter each year. Posts were made predominantly by women (14,897/20,883, 71.34%), with 12.00% (2507/20,883) made by men and 16.67% (3479/20,883) by individuals of unknown gender. The mean age of web users was 39 (SD 19) years. In the Aggregate group, pain was the most common medical concept discussed (22,257/37,863, 58.78%), and paracetamol risk was the most common discussion topic, addressed in 20.36% (8902/43,725) of posts. Doliprane was the most common medication mentioned (14,058/44,283, 31.74%) within the Aggregate group, and tramadol was the most commonly mentioned drug in combination with paracetamol in the Aggregate group (1038/19,587, 5.30%). The most common unapproved indication mentioned within the Paracetamol Only group was fatigue (190/616, with 16.32% positive for an unapproved indication), with reference to dependence made by 1.61% (136/8470) of the web users, accounting for 1.33% (171/12,843) of the posts in the Paracetamol Only group. Dependence mentions in the Paracetamol and Opioids group were provided by 6.94% (248/3576) of web users, accounting for 5.44% (342/6281) of total posts. Reference to overdose was made by 245 web users across 291 posts within the Paracetamol Only group. The most common potential adverse event detected was nausea (306/12,843, 2.38%) within the Paracetamol Only group. CONCLUSIONS The use of social media mining with the Detec't tool provided valuable information on the perceptions and understanding of the web users, highlighting areas where providing more information for the general public on paracetamol, as well as other medications, may be of benefit.
|
14
|
Satwika MV, Sushma DS, Jaiswal V, Asha S, Pal T. The Role of Advanced Technologies Supplemented with Traditional Methods in Pharmacovigilance Sciences. Recent Pat Biotechnol 2021; 15:34-50. [PMID: 33087036 DOI: 10.2174/1872208314666201021162704] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/16/2020] [Revised: 08/05/2020] [Accepted: 09/21/2020] [Indexed: 11/22/2022]
Abstract
BACKGROUND Immediate, automatic, systemic monitoring and reporting of adverse drug reactions, with improved efficacy, is a pressing need of the medical informatics community. The venturing of advanced digital technologies into the health sector has opened new avenues for rapid monitoring. In recent years, data shared through social media, mobile apps, and other social websites has increased manifold, requiring data mining techniques. OBJECTIVE The objective of this report is to highlight the role of advanced technologies together with the traditional methods to proactively aid in the early detection of adverse drug reactions concerned with drug safety and pharmacovigilance. METHODS A thorough search was conducted on papers and patents regarding pharmacovigilance. All articles relevant to the subject were explored and mined from public repositories such as PubMed, Google Scholar, Springer, ScienceDirect (Elsevier), Web of Science, etc. RESULTS The European Union's Innovative Medicines Initiative WEB-RADR project has emphasized the development of mobile applications and social media data for reporting adverse effects. Only relevant data has to be captured through the data mining algorithms (DMAs), as it plays an important role in timely prediction of risk with high accuracy using two popular approaches: the frequentist and the Bayesian approach. Pharmacovigilance at the pre-marketing stage is useful for the prediction of adverse drug reactions in the early developmental stage of a drug. Later, post-marketing safety reports and clinical data reports need to be monitored through electronic health records, prescription-event monitoring, spontaneous reporting databases, etc. CONCLUSION Advanced technologies supplemented with traditional technologies are the need of the hour for evaluating a product's risk profile and reducing risk in the population, especially in those with comorbid conditions and on concomitant medications.
Affiliation(s)
- Mandali V Satwika
  - Department of Biotechnology, Vignan's Foundation for Science, Technology and Research (Deemed to be University), Vadlamudi, Guntur, 522213, Andhra Pradesh, India
- Dudala S Sushma
  - Department of Biotechnology, Vignan's Foundation for Science, Technology and Research (Deemed to be University), Vadlamudi, Guntur, 522213, Andhra Pradesh, India
- Varun Jaiswal
  - School of Electrical and Computer Science Engineering, Shoolini University, Solan, Himachal Pradesh, 173212, India
- Syed Asha
  - Department of Biotechnology, Vignan's Foundation for Science, Technology and Research (Deemed to be University), Vadlamudi, Guntur, 522213, Andhra Pradesh, India
- Tarun Pal
  - Department of Biotechnology, Vignan's Foundation for Science, Technology and Research (Deemed to be University), Vadlamudi, Guntur, 522213, Andhra Pradesh, India
|
15
|
Fairie P, Zhang Z, D'Souza AG, Walsh T, Quan H, Santana MJ. Categorising patient concerns using natural language processing techniques. BMJ Health Care Inform 2021; 28:e100274. [PMID: 34193519 PMCID: PMC8246286 DOI: 10.1136/bmjhci-2020-100274] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 10/30/2020] [Accepted: 05/20/2021] [Indexed: 11/23/2022] [Open access]
Abstract
OBJECTIVES Patient feedback is critical to identify and resolve patient safety and experience issues in healthcare systems. However, large volumes of unstructured text data can pose problems for manual (human) analysis. This study reports the results of using a semiautomated, computational topic-modelling approach to analyse a corpus of patient feedback. METHODS Patient concerns were received by Alberta Health Services between 2011 and 2018 (n=76 163), regarding 806 care facilities in 163 municipalities, including hospitals, clinics, community care centres and retirement homes, in a province of 4.4 million. Their existing framework requires manual labelling of pre-defined categories. We applied an automated latent Dirichlet allocation (LDA)-based topic modelling algorithm to identify the topics present in these concerns, and thereby produce a framework-free categorisation. RESULTS The LDA model produced 40 topics which, following manual interpretation by researchers, were reduced to 28 coherent topics. The most frequent topics identified were communication issues causing delays (frequency: 10.58%), community care for elderly patients (8.82%), interactions with nurses (8.80%) and emergency department care (7.52%). Many patient concerns were categorised into multiple topics. Some were more specific versions of categories from the existing framework (eg, communication issues causing delays), while others were novel (eg, smoking in inappropriate settings). DISCUSSION LDA-generated topics were more nuanced than the manually labelled categories. For example, LDA found that concerns with community care were related to concerns about nursing for seniors, providing opportunities for insight and action. CONCLUSION Our findings outline the range of concerns patients share in a large health system and demonstrate the usefulness of using LDA to identify categories of patient concerns.
Affiliation(s)
- Paul Fairie
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Alberta Strategy for Patient-Oriented Research Patient Engagement Platform, Calgary, Alberta, Canada
- Zilong Zhang
  - Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Adam G D'Souza
  - Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Alberta Health Services, Calgary, Alberta, Canada
- Tara Walsh
  - Alberta Health Services, Calgary, Alberta, Canada
- Hude Quan
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Maria J Santana
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Alberta Strategy for Patient-Oriented Research Patient Engagement Platform, Calgary, Alberta, Canada
  - Department of Pediatrics, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
|
16
|
Nguyen AXL, Trinh XV, Wang SY, Wu AY. Determination of Patient Sentiment and Emotion in Ophthalmology: Infoveillance Tutorial on Web-Based Health Forum Discussions. J Med Internet Res 2021; 23:e20803. [PMID: 33999001 PMCID: PMC8167608 DOI: 10.2196/20803] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 05/29/2020] [Revised: 08/27/2020] [Accepted: 03/16/2021] [Indexed: 01/26/2023] [Open access]
Abstract
Background Clinical data in social media are an underused source of information with great potential to allow for a deeper understanding of patient values, attitudes, and preferences. Objective This tutorial aims to describe a novel, robust, and modular method for the sentiment analysis and emotion detection of free text from web-based forums and the factors to consider during its application. Methods We mined the discussion and user information of all posts containing search terms related to a medical subspecialty (oculoplastics) from MedHelp, the largest web-based platform for patient health forums. We used data cleaning and processing tools to define the relevant subset of results and prepare them for sentiment analysis. We executed sentiment and emotion analyses by using IBM Watson Natural Language Understanding to generate sentiment and emotion scores for the posts and their associated keywords. The keywords were aggregated using natural language processing tools. Results Overall, 39 oculoplastic-related search terms resulted in 46,381 eligible posts within 14,329 threads. Posts were written by 18,319 users (117 doctors; 18,202 patients) and included 201,611 associated keywords. Keywords that occurred ≥500 times in the corpus were used to identify the most prominent topics, including specific symptoms, medication, and complications. The sentiment and emotion scores of these keywords and eligible posts were analyzed to provide concrete examples of the potential of this methodology to allow for a better understanding of patients’ attitudes. The overall sentiment score reflects a positive, neutral, or negative sentiment, whereas the emotion scores (anger, disgust, fear, joy, and sadness) represent the likelihood of the presence of the emotion. In keyword grouping analyses, medical signs, symptoms, and diseases had the lowest overall sentiment scores (−0.598). Complications were highly associated with sadness (0.485). 
Forum posts mentioning body parts were related to sadness (0.416) and fear (0.321). Administration was the category with the highest anger score (0.146). The top 6 forum subgroups had an overall negative sentiment score; the most negative one was the Neurology forum, with a score of −0.438. The Undiagnosed Symptoms forum had the highest sadness score (0.448). The least likely fearful posts were those from the Eye Care forum, with a score of 0.260. The overall sentiment score was much more negative before the doctor replied. The anger, disgust, fear, and sadness emotion scores decreased in likelihood, whereas joy was slightly more likely to be expressed after doctors replied. Conclusions This report allows physicians and researchers to efficiently mine and perform sentiment analysis on social media to better understand patients’ perspectives and promote patient-centric care. Important factors to be considered during its application include evaluating the scope of the search; selecting search terms and understanding their linguistic usages; and establishing selection, filtering, and processing criteria for posts and keywords tailored to the desired results.
Affiliation(s)
- Xuan-Vi Trinh
  - Department of Computer Science, McGill University, Montreal, QC, Canada
- Sophia Y Wang
  - Department of Ophthalmology, Byers Eye Institute, Stanford University, Palo Alto, CA, United States
- Albert Y Wu
  - Department of Ophthalmology, Byers Eye Institute, Stanford University, Palo Alto, CA, United States
|
17
|
Abstract
Twitter's popularity produces a massive amount of data, which is one of the reasons underlying big data problems. One of those problems is the classification of tweets: their sophisticated and complex language makes current tools insufficient. We present our framework HTwitt, built on top of the Hadoop ecosystem, which consists of a MapReduce algorithm and a set of machine learning techniques embedded within a big data analytics platform to efficiently address the following problems: (1) traditional data processing techniques are inadequate to handle big data; (2) data preprocessing needs substantial manual effort; (3) domain knowledge is required before the classification; (4) semantic explanation is ignored. In this work, these challenges are overcome by using different algorithms combined with a Naïve Bayes classifier to ensure reliability and highly precise recommendations in virtualization and cloud environments. These features make HTwitt different from others in terms of having an effective and practical design for text classification in big data analytics. The main contribution of the paper is to propose a framework for building landslide early warning systems by pinpointing useful tweets and visualizing them along with the processed information. We present experimental results that quantify the levels of overfitting in the training stage of the model using different sizes of real-world datasets in machine learning phases. Our results demonstrate that the proposed system provides high-quality results with a score of nearly 95% and meets the requirement of a Hadoop-based classification system.
|
18
|
Al-Garadi MA, Yang YC, Cai H, Ruan Y, O'Connor K, Graciela GH, Perrone J, Sarker A. Text classification models for the automatic detection of nonmedical prescription medication use from social media. BMC Med Inform Decis Mak 2021; 21:27. [PMID: 33499852 PMCID: PMC7835447 DOI: 10.1186/s12911-021-01394-0] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 08/11/2020] [Accepted: 01/12/2021] [Indexed: 01/27/2023] [Open access]
Abstract
BACKGROUND Prescription medication (PM) misuse/abuse has emerged as a national crisis in the United States, and social media has been suggested as a potential resource for performing active monitoring. However, automating a social media-based monitoring system is challenging, requiring advanced natural language processing (NLP) and machine learning methods. In this paper, we describe the development and evaluation of automatic text classification models for detecting self-reports of PM abuse from Twitter. METHODS We experimented with state-of-the-art bi-directional transformer-based language models, which utilize tweet-level representations that enable transfer learning (e.g., BERT, RoBERTa, XLNet, AlBERT, and DistilBERT), proposed fusion-based approaches, and compared the developed models with several traditional machine learning, including deep learning, approaches. Using a public dataset, we evaluated the performances of the classifiers on their abilities to classify the non-majority "abuse/misuse" class. RESULTS Our proposed fusion-based model performs significantly better than the best traditional model (F1-score [95% CI]: 0.67 [0.64-0.69] vs. 0.45 [0.42-0.48]). We illustrate, via experimentation using varying training set sizes, that the transformer-based models are more stable and require less annotated data compared to the other models. The significant improvements achieved by our best-performing classification model over past approaches make it suitable for automated continuous monitoring of nonmedical PM use from Twitter. CONCLUSIONS BERT, BERT-like and fusion-based models outperform traditional machine learning and deep learning models, achieving substantial improvements over many years of past research on the topic of prescription medication misuse/abuse classification from social media, which had been shown to be a complex task due to the unique ways in which information about nonmedical use is presented. Several challenges associated with the lack of context and the nature of social media language need to be overcome to further improve BERT and BERT-like models. These experimentally driven challenges are presented as potential future research directions.
Affiliation(s)
- Mohammed Ali Al-Garadi
  - Department of Biomedical Informatics, School of Medicine, Emory University, 101 Woodruff Circle, Atlanta, GA, 30322, USA
- Yuan-Chi Yang
  - Department of Biomedical Informatics, School of Medicine, Emory University, 101 Woodruff Circle, Atlanta, GA, 30322, USA
- Haitao Cai
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Yucheng Ruan
  - School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Karen O'Connor
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Gonzalez-Hernandez Graciela
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Jeanmarie Perrone
  - Department of Emergency Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Abeed Sarker
  - Department of Biomedical Informatics, School of Medicine, Emory University, 101 Woodruff Circle, Atlanta, GA, 30322, USA
  - Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, 30322, USA
|
19
|
Zaman N, Goldberg DM, Abrahams AS, Essig RA. Facebook Hospital Reviews: Automated Service Quality Detection and Relationships with Patient Satisfaction. Decision Sciences 2020. [DOI: 10.1111/deci.12479] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 11/26/2022]
Affiliation(s)
- Nohel Zaman
  - Loyola Marymount University, 1 LMU Drive, Los Angeles, CA 90045
|
20
|
Huang HL, Hong SH, Tsai YC. Approaches to text mining for analyzing treatment plan of quit smoking with free-text medical records: A PRISMA-compliant meta-analysis. Medicine (Baltimore) 2020; 99:e20999. [PMID: 32702841 PMCID: PMC7373589 DOI: 10.1097/md.0000000000020999] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 12/04/2022] [Open access]
Abstract
BACKGROUND Smoking is a complex behavior associated with multiple factors such as personality, environment, genetics, and emotions. Text data are a rich source of information. However, pure text data require substantial human resources and time to extract and apply the knowledge, resulting in many details not being discovered and used. This study proposes a novel approach that explores a text mining flow to capture the behavior of smokers quitting tobacco from their free-text medical records. More importantly, the paper examines the impact of these changes on smokers. The goal is to help smokers quit smoking. The study population included adult patients >20 years of age who consulted the medical center's smoking cessation outpatient clinic from January to December 2016. A total of 246 patients visited the clinic in the study period. After excluding incomplete medical records or lost follow up, there were 141 patients included in the final analysis. There are 141 valid data points for patients who only treated once and patients with empty medical records. Two independent review authors will make the study selection based on the study eligibility criteria. Our participants are from all the patients that were involved in this study and the staff of Division of Family Medicine, National Taiwan University Hospital. Interventions and study appraisal are not required. METHODS The paper develops an algorithm for analyzing smoking cessation treatment plans documented in free-text medical records. The approach involves the development of an information extraction flow that uses a combination of data mining techniques, including text mining. It can be used not only to help others quit smoking but also for other medical records with similar data elements. The Apriori associations of our algorithm from the text mining revealed several important clinical implications for physicians during smoking cessation. For example, there was an apparent association between nicotine replacement therapy (NRT) and other medications such as Inderal, Rivotril, Dogmatyl, and Solaxin. Inderal and Rivotril are frequently used as anxiolytics in patients with anxiety disorders. RESULTS Finally, we find that the rules associating NRT combination with blood tests may imply that the use of NRT combination therapy in smokers with chronic illness may result in lower abstinence. Further large-scale surveys comparing varenicline or bupropion with NRT combination in smokers with a chronic disease are warranted. The Apriori algorithm suffers from some weaknesses despite being transparent and straightforward. The main limitation is the costly time spent holding a vast number of candidate sets when there are many frequent itemsets, a low minimum support, or large itemsets. CONCLUSION In the paper, the most visible areas for the therapeutic application of text mining are the integration and transfer of advances made in basic sciences, as well as a better understanding of the processes involved in smoking cessation. Text mining may also be useful for supporting decision-making processes associated with smoking cessation. The systematic review is not registered.
Affiliation(s)
- Hsien-Liang Huang
  - Division of Family Medicine, National Taiwan University Hospital, Zhongzheng Dist
- Shi-Hao Hong
  - Computer Science and Technology, HeFei University of Technology, Hefei, Anhui Province
- Yun-Cheng Tsai
  - School of Big Data Management, Soochow University, Shihlin District, Taipei City, Taiwan (R.O.C.)
|
21
|
Correia RB, Wood IB, Bollen J, Rocha LM. Mining Social Media Data for Biomedical Signals and Health-Related Behavior. Annu Rev Biomed Data Sci 2020; 3:433-458. [PMID: 32550337 PMCID: PMC7299233 DOI: 10.1146/annurev-biodatasci-030320-040844] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 01/01/2023]
Abstract
Social media data have been increasingly used to study biomedical and health-related phenomena. From cohort-level discussions of a condition to population-level analyses of sentiment, social media have provided scientists with unprecedented amounts of data to study human behavior associated with a variety of health conditions and medical treatments. Here we review recent work in mining social media for biomedical, epidemiological, and social phenomena information relevant to the multilevel complexity of human health. We pay particular attention to topics where social media data analysis has shown the most progress, including pharmacovigilance and sentiment analysis, especially for mental health. We also discuss a variety of innovative uses of social media data for health-related applications as well as important limitations of social media data access and use.
Affiliation(s)
- Rion Brattig Correia
  - Instituto Gulbenkian de Ciência, 2780-156 Oeiras, Portugal
  - Center for Social and Biomedical Complexity, Luddy School of Informatics, Computing & Engineering, Indiana University, Bloomington, Indiana 47408, USA
  - CAPES Foundation, Ministry of Education of Brazil, 70040 Brasília DF, Brazil
- Ian B Wood
  - Center for Social and Biomedical Complexity, Luddy School of Informatics, Computing & Engineering, Indiana University, Bloomington, Indiana 47408, USA
- Johan Bollen
  - Center for Social and Biomedical Complexity, Luddy School of Informatics, Computing & Engineering, Indiana University, Bloomington, Indiana 47408, USA
- Luis M Rocha
  - Instituto Gulbenkian de Ciência, 2780-156 Oeiras, Portugal
  - Center for Social and Biomedical Complexity, Luddy School of Informatics, Computing & Engineering, Indiana University, Bloomington, Indiana 47408, USA
|
22
|
Zhang Y, Cui S, Gao H. Adverse drug reaction detection on social media with deep linguistic features. J Biomed Inform 2020; 106:103437. [PMID: 32360987 DOI: 10.1016/j.jbi.2020.103437] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 11/16/2019] [Revised: 04/02/2020] [Accepted: 04/26/2020] [Indexed: 11/26/2022]
Abstract
Adverse reactions caused by drugs are one of the most important public health problems. Social media has encouraged more patients to share their drug use experiences and has become a major source for the detection of professionally unreported adverse drug reactions (ADRs). Since a large number of user posts do not mention any ADR, accurate detection of the presence of ADRs in each user post is necessary before further research can be conducted. Previous feature-based methods focus on extracting more shallow linguistic features that are unable to capture deep and subtle information in the context, ultimately failing to provide satisfactory accuracy. To overcome the limitations of previous studies, this paper proposes a novel method that can extract deep linguistic features and then combine them with shallow linguistic features for ADR detection. We first extract predicate-ADR pairs under the guidance of extended syntactic dependencies and ADR lexicon. Then, we extract semantic and part-of-speech (POS) features for each pair and pool the features of different pairs to generate a holistic representation of deep linguistic features. Finally, we use the collection of deep features and several shallow features to train the predictive models. A series of experiments are performed on data sets collected from DailyStrength and Twitter. Our approach can achieve AUCs of 94.44% and 88.97% on the two data sets, respectively, outperforming other state-of-the-art methods. The results demonstrate the potential benefits of deep linguistic features for ADR detection on social data. This method can be applied to multiple other healthcare and text analysis tasks and can be used to support pharmacovigilance research.
Affiliation(s)
- Ying Zhang
- School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China; School of Business, University of Jinan, Jinan 250022, China.
- Shaoze Cui
- School of Economics and Management, Dalian University of Technology, Dalian 116023, China.
- Huiying Gao
- School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China.
|
23
|
Daluwatte C, Schotland P, Strauss DG, Burkhart KK, Racz R. Predicting potential adverse events using safety data from marketed drugs. BMC Bioinformatics 2020; 21:163. [PMID: 32349656 PMCID: PMC7191698 DOI: 10.1186/s12859-020-3509-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2019] [Accepted: 04/22/2020] [Indexed: 11/26/2022] Open
Abstract
BACKGROUND While clinical trials are considered the gold standard for detecting adverse events, often these trials are not sufficiently powered to detect difficult to observe adverse events. We developed a preliminary approach to predict 135 adverse events using post-market safety data from marketed drugs. Adverse event information available from FDA product labels and scientific literature for drugs that have the same activity at one or more of the same targets, structural and target similarities, and the duration of post market experience were used as features for a classifier algorithm. The proposed method was studied using 54 drugs and a probabilistic approach of performance evaluation using bootstrapping with 10,000 iterations. RESULTS Out of 135 adverse events, 53 had high probability of having high positive predictive value. Cross validation showed that 32% of the model-predicted safety label changes occurred within four to nine years of approval (median: six years). CONCLUSIONS This approach predicts 53 serious adverse events with high positive predictive values where well-characterized target-event relationships exist. Adverse events with well-defined target-event associations were better predicted compared to adverse events that may be idiosyncratic or related to secondary target effects that were poorly captured. Further enhancement of this model with additional features, such as target prediction and drug binding data, may increase accuracy.
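The abstract above evaluates performance probabilistically, using bootstrapping with 10,000 iterations to judge whether an adverse event has a high positive predictive value (PPV). A minimal sketch of that idea follows, using only the Python standard library. The function names, the 0.8 threshold, and the resampling unit are illustrative assumptions and are not taken from the paper.

```python
import random


def ppv(predicted, actual):
    """Positive predictive value: true positives / all predicted positives."""
    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    pred_pos = sum(1 for p in predicted if p)
    return true_pos / pred_pos if pred_pos else 0.0


def prob_high_ppv(predicted, actual, threshold=0.8, iterations=10_000, seed=0):
    """Fraction of bootstrap resamples whose PPV reaches `threshold`.

    `threshold` is a hypothetical cut-off chosen for illustration only.
    """
    rng = random.Random(seed)
    n = len(predicted)
    hits = 0
    for _ in range(iterations):
        # Resample (prediction, label) pairs with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        sample_pred = [predicted[i] for i in idx]
        sample_true = [actual[i] for i in idx]
        if ppv(sample_pred, sample_true) >= threshold:
            hits += 1
    return hits / iterations
```

Calling `prob_high_ppv(predictions, labels)` on per-drug binary predictions for one adverse event gives an estimate of how often the PPV would clear the chosen threshold under resampling.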
Affiliation(s)
- Chathuri Daluwatte
- Division of Applied Regulatory Science, Food and Drug Administration, 10903 New Hampshire Ave, Silver Spring, MD 20993 USA
- Peter Schotland
- Office of New Drugs, Food and Drug Administration, Silver Spring, MD USA
- David G. Strauss
- Division of Applied Regulatory Science, Food and Drug Administration, 10903 New Hampshire Ave, Silver Spring, MD 20993 USA
- Keith K. Burkhart
- Division of Applied Regulatory Science, Food and Drug Administration, 10903 New Hampshire Ave, Silver Spring, MD 20993 USA
- Rebecca Racz
- Division of Applied Regulatory Science, Food and Drug Administration, 10903 New Hampshire Ave, Silver Spring, MD 20993 USA
|
24
|
|
25
|
Lewis DJ, McCallum JF. Utilizing Advanced Technologies to Augment Pharmacovigilance Systems: Challenges and Opportunities. Ther Innov Regul Sci 2019; 54:888-899. [PMID: 32557311 PMCID: PMC7362887 DOI: 10.1007/s43441-019-00023-3] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 08/16/2019] [Accepted: 11/04/2019] [Indexed: 01/01/2023]
Abstract
There are significant challenges and opportunities in deploying and utilizing advanced information technology (IT) within pharmacovigilance (PV) systems and across the pharmaceutical industry. Various aspects of PV will benefit from automation (e.g., by improving standardization or increasing data quality). Several themes are developed, highlighting the challenges faced, exploring solutions, and assessing the potential for further research. Automation of the workflow for processing of individual case safety reports (ICSRs) is adopted as a use case. This involves a logical progression through a series of steps that when linked together comprise the complete work process required for the effective management of ICSRs. We recognize that the rapid development of new technologies will invariably outpace the regulations applicable to PV systems. Nevertheless, we believe that such systems may be improved by intelligent automation. It is incumbent on the owners of these systems to explore opportunities presented by new technologies with regulators in order to evaluate the applicability, design, deployment, performance, validation and maintenance of advanced technologies to ensure that the PV system continues to be fit for purpose. Proposed approaches to the validation of automated PV systems are presented. A series of definitions and a critical appraisal of important considerations are provided in the form of use cases. We summarize progress made and opportunities for the development of automation of future systems. The overall goal of automation is to provide high quality safety data in the correct format, in context, more quickly, and with less manual effort. This will improve the evidence available for scientific assessment and help to inform and expedite decisions about the minimization of risks associated with medicines.
Affiliation(s)
- David John Lewis
- Novartis Global Drug Development, Novartis Pharma GmbH, Oeflinger Strasse 44, D-79664, Wehr, Germany; Department of Pharmacy, Pharmacology and Postgraduate Medicine, University of Hertfordshire, Hatfield, Hertfordshire, AL10 9AB, UK.
- John Fraser McCallum
- Product Development Safety Risk Management, Roche Products Limited, 6 Falcon Way, Shire Park, Welwyn Garden City, Hertfordshire, AL7 1TW, UK
|
26
|
da Silva DA, ten Caten CS, dos Santos RP, Fogliatto FS, Hsuan J. Predicting the occurrence of surgical site infections using text mining and machine learning. PLoS One 2019; 14:e0226272. [PMID: 31834905 PMCID: PMC6910696 DOI: 10.1371/journal.pone.0226272] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 04/29/2019] [Accepted: 11/22/2019] [Indexed: 12/11/2022] Open
Abstract
In this study we propose the use of text mining and machine learning methods to predict and detect Surgical Site Infections (SSIs) using textual descriptions of surgeries and post-operative patients’ records, mined from the database of a high complexity University hospital. SSIs are among the most common adverse events experienced by hospitalized patients; preventing such events is fundamental to ensure patients’ safety. Knowledge on SSI occurrence rates may also be useful in preventing future episodes. We analyzed 15,479 surgery descriptions and post-operative records testing different preprocessing strategies and the following machine learning algorithms: Linear SVC, Logistic Regression, Multinomial Naive Bayes, Nearest Centroid, Random Forest, Stochastic Gradient Descent, and Support Vector Classification (SVC). For prediction purposes, the best result was obtained using the Stochastic Gradient Descent method (79.7% ROC-AUC); for detection, Logistic Regression yielded the best performance (80.6% ROC-AUC).
Affiliation(s)
- Daniel A. da Silva
- Industrial Engineering Department, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Carla S. ten Caten
- Industrial Engineering Department, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Flavio S. Fogliatto
- Industrial Engineering Department, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
|
27
|
Li S, Yu CH, Wang Y, Babu Y. Exploring adverse drug reactions of diabetes medicine using social media analytics and interactive visualizations. International Journal of Information Management 2019. [DOI: 10.1016/j.ijinfomgt.2018.12.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/24/2022]
|
28
|
Dai HJ, Wang CK. Classifying adverse drug reactions from imbalanced twitter data. Int J Med Inform 2019; 129:122-132. [PMID: 31445246 DOI: 10.1016/j.ijmedinf.2019.05.017] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 09/16/2018] [Revised: 04/07/2019] [Accepted: 05/21/2019] [Indexed: 10/26/2022]
Abstract
BACKGROUND Nowadays, social media are often being used by general public to create and share public messages related to their health. With the global increase in social media usage, there is a trend of posting information related to adverse drug reactions (ADR). Mining the social media data for this type of information will be helpful for pharmacological post-marketing surveillance and monitoring. Although the concept of using social media to facilitate pharmacovigilance is convincing, construction of automatic ADR detection systems remains a challenge because the corpora compiled from social media tend to be highly imbalanced, posing a major obstacle to the development of classifiers with reliable performance. METHODS Several methods have been proposed to address the challenge of imbalanced corpora. However, we are not aware of any studies that investigated the effectiveness of the strategies of dealing with the problem of imbalanced data in the context of ADR detection from social media. In light of this, we evaluated a variety of imbalanced techniques and proposed a novel word embedding-based synthetic minority over-sampling technique (WESMOTE), which synthesizes new training examples from the sentence representation based on word embeddings. We compared the performance of all methods on two large imbalanced datasets released for the purpose of detecting ADR posts. RESULTS In comparison with the state-of-the-art approaches, the classifiers that incorporated imbalanced classification techniques achieved comparable or better F-scores. All of our best performing configurations combined random under-sampling with techniques including the proposed WESMOTE, boosting and ensemble, implying that an integration of these approaches with under-sampling provides a reliable solution for large imbalanced social media datasets. 
Furthermore, ensemble-based methods like vote-based under-sampling (VUE) and random under-sampling boosting can be alternatives for the hybrid synthetic methods because both methods increase the diversity of the created weak classifiers, leading to better recall and overall F-scores for the minority classes. CONCLUSIONS Data collected from the social media are usually very large and highly imbalanced. In order to maximize the performance of a classifier trained on such data, applications of imbalanced strategies are required. We considered several practical methods for handling imbalanced Twitter data along with their performance on the binary classification task with respect to ADRs. In conclusion, the following practical insights are gained: 1) When dealing with text classification, the proposed word embedding-based synthetic minority over-sampling technique is more effective than traditional synthetic-based over-sampling methods. 2) In cases where large amounts of training data are available, the imbalanced strategies combined with under-sampling techniques are preferred. 3) Finally, employment of advanced methods does not guarantee better performance than simpler ones such as VUE, which achieved high performance with advantages like faster building time and ease of development.
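The abstract describes WESMOTE as synthesizing new minority-class training examples from sentence representations built on word embeddings. The sketch below illustrates the general idea with classic SMOTE-style interpolation applied to averaged word vectors. It is not the authors' implementation: the averaging step, the function names, the neighbour count `k`, and the toy embedding dictionary format are all assumptions made for illustration.

```python
import math
import random


def sentence_vector(tokens, embeddings):
    """Average the word vectors of the tokens found in `embeddings` (an assumed dict)."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        raise ValueError("no token has an embedding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create `n_new` synthetic vectors by interpolating between a minority
    sample and one of its `k` nearest minority neighbours (SMOTE-style).

    Requires at least two minority vectors.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours by Euclidean distance, excluding the base itself.
        neighbours = sorted(
            (v for v in minority if v is not base),
            key=lambda v: math.dist(base, v),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic
```

In line with the abstract's second insight, such over-sampling would typically be paired with random under-sampling of the majority class before training a classifier.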
Affiliation(s)
- Hong-Jie Dai
- Department of Electrical Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan, Republic of China; Post Baccalaureate Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan, Republic of China.
- Chen-Kai Wang
- Big Data laboratories of Chunghwa Telecom Laboratories, Taoyuan, Taiwan, Republic of China.
|
29
|
Harnessing social media data for pharmacovigilance: a review of current state of the art, challenges and future directions. International Journal of Data Science and Analytics 2019. [DOI: 10.1007/s41060-019-00175-3] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Indexed: 01/07/2023]
|
30
|
Wang CS, Lin PJ, Cheng CL, Tai SH, Kao Yang YH, Chiang JH. Detecting Potential Adverse Drug Reactions Using a Deep Neural Network Model. J Med Internet Res 2019; 21:e11016. [PMID: 30724742 PMCID: PMC6381404 DOI: 10.2196/11016] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 05/10/2018] [Revised: 10/03/2018] [Accepted: 11/04/2018] [Indexed: 01/17/2023] Open
Abstract
BACKGROUND Adverse drug reactions (ADRs) are common and are the underlying cause of over a million serious injuries and deaths each year. The most familiar method to detect ADRs is relying on spontaneous reports. Unfortunately, the low reporting rate of spontaneous reports is a serious limitation of pharmacovigilance. OBJECTIVE The objective of this study was to identify a method to detect potential ADRs of drugs automatically using a deep neural network (DNN). METHODS We designed a DNN model that utilizes the chemical, biological, and biomedical information of drugs to detect ADRs. This model aimed to fulfill two main purposes: identifying the potential ADRs of drugs and predicting the possible ADRs of a new drug. For improving the detection performance, we distributed representations of the target drugs in a vector space to capture the drug relationships using the word-embedding approach to process substantial biomedical literature. Moreover, we built a mapping function to address new drugs that do not appear in the dataset. RESULTS Using the drug information and the ADRs reported up to 2009, we predicted the ADRs of drugs recorded up to 2012. There were 746 drugs and 232 new drugs, which were only recorded in 2012 with 1325 ADRs. The experimental results showed that our model achieved a mean average precision at top-10 of 0.523 and an area under the receiver operating characteristic curve (AUC) of 0.844 for ADR prediction on the dataset. CONCLUSIONS Our model is effective in identifying the potential ADRs of a drug and the possible ADRs of a new drug. Most importantly, it can detect potential ADRs irrespective of whether they have been reported in the past.
Affiliation(s)
- Chi-Shiang Wang
- Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan
- Pei-Ju Lin
- Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan
- Ching-Lan Cheng
- School of Pharmacy, College of Medicine, National Cheng Kung University, Tainan, Taiwan; Institute of Clinical Pharmacy and Pharmaceutical Sciences, College of Medicine, National Cheng Kung University, Tainan, Taiwan; Department of Pharmacy, National Cheng Kung University Hospital, National Cheng Kung University, Tainan, Taiwan
- Shu-Hua Tai
- Department of Pharmacy, National Cheng Kung University Hospital, National Cheng Kung University, Tainan, Taiwan
- Yea-Huei Kao Yang
- School of Pharmacy, College of Medicine, National Cheng Kung University, Tainan, Taiwan; Institute of Clinical Pharmacy and Pharmaceutical Sciences, College of Medicine, National Cheng Kung University, Tainan, Taiwan
- Jung-Hsien Chiang
- Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan; Institute of Medical Informatics, National Cheng Kung University, Tainan, Taiwan
|
31
|
Dutilleul A, Morel J, Schilte C, Launay O, Autran B, Béhier JM, Borel T, Bresse X, Chêne G, Courcier S, Dufour V, Faurisson F, Gagneur A, Gelpi O, Gérald F, Kheloufi F, Koeck JL, Lamarque-Garnier V, Lery T, Ménin G, Molimard M, Opinel A, Roger C, Rouby F, Schuck S, Simon L, Soubeyrand B, Truchet MC. How to improve vaccine acceptability (evaluation, pharmacovigilance, communication, public health, mandatory vaccination, fears and beliefs). Therapie 2019; 74:131-140. [DOI: 10.1016/j.therap.2018.12.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 11/06/2018] [Accepted: 11/19/2018] [Indexed: 10/27/2022]
|
32
|
Dutilleul A, Morel J, Schilte C, Launay O, Autran B, Béhier JM, Borel T, Bresse X, Chêne G, Courcier S, Dufour V, Faurisson F, Gagneur A, Gelpi O, Gérald F, Kheloufi F, Koeck JL, Lamarque-Garnier V, Lery T, Ménin G, Molimard M, Opinel A, Roger C, Rouby F, Schuck S, Simon L, Soubeyrand B, Truchet MC. Comment améliorer l’acceptabilité vaccinale (évaluation, pharmacovigilance, communication, santé publique, obligation vaccinale, peurs et croyances). Therapie 2019; 74:119-129. [DOI: 10.1016/j.therap.2018.11.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
|
33
|
Schmider J, Kumar K, LaForest C, Swankoski B, Naim K, Caubel PM. Innovation in Pharmacovigilance: Use of Artificial Intelligence in Adverse Event Case Processing. Clin Pharmacol Ther 2018; 105:954-961. [PMID: 30303528 PMCID: PMC6590385 DOI: 10.1002/cpt.1255] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Received: 07/27/2018] [Accepted: 09/25/2018] [Indexed: 12/03/2022]
Abstract
Automation of pharmaceutical safety case processing represents a significant opportunity to affect the strongest cost driver for a company's overall pharmacovigilance budget. A pilot was undertaken to test the feasibility of using artificial intelligence and robotic process automation to automate processing of adverse event reports. The pilot paradigm was used to simultaneously test proposed solutions of three commercial vendors. The result confirmed the feasibility of using artificial intelligence–based technology to support extraction from adverse event source documents and evaluation of case validity. In addition, the pilot demonstrated viability of the use of safety database data fields as a surrogate for otherwise time‐consuming and costly direct annotation of source documents. Finally, the evaluation and scoring method used in the pilot was able to differentiate vendor capabilities and identify the best candidate to move into the discovery phase.
Affiliation(s)
- Krishan Kumar
- Pfizer Business Technology, Artificial Intelligence Center of Excellence, La Jolla, California, USA
- Chantal LaForest
- Pfizer Global Product Development, Safety Solutions, Kirkland, Quebec, Ontario, Canada
- Brian Swankoski
- Pfizer Finance and Business Operations, Peapack, New Jersey, USA
- Karen Naim
- Pfizer R&D, Collegeville, Pennsylvania, USA
|
34
|
Zhou S, Kang H, Yao B, Gong Y. An automated pipeline for analyzing medication event reports in clinical settings. BMC Med Inform Decis Mak 2018; 18:113. [PMID: 30526590 PMCID: PMC6284273 DOI: 10.1186/s12911-018-0687-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 02/02/2023] Open
Abstract
BACKGROUND Medication events in clinical settings are significant threats to patient safety. Analyzing and learning from the medication event reports is an important way to prevent the recurrence of these events. Currently, the analysis of medication event reports is ineffective and requires heavy workloads for clinicians. An automated pipeline is proposed to help clinicians deal with the accumulated reports, extract valuable information and generate feedback from the reports. Thus, the strategy of medication event prevention can be further developed based on the lessons learned. METHODS In order to build the automated pipeline, four classic machine learning classifiers (i.e., support vector machine, Naïve Bayes, random forest, and multi-layer perceptron) were compared to identify the event originating stages, event types, and event causes from the medication event reports. The precision, recall and F-1 measure were calculated to assess the performance of the classifiers. Further, a strategy to measure the similarity of medication event reports in our pipeline was established and evaluated by human subjects through a questionnaire. RESULTS We developed three classifiers to identify the medication event originating stages, event types and causes, respectively. For the event originating stages, a support vector machine classifier obtains the best performance with an F-1 measure of 0.792. For the event types, a support vector machine classifier exhibits the best performance with an F-1 measure of 0.758. And for the event causes, a random forest classifier reaches an F-1 measure of 0.925. The questionnaire results show that the similarity measurement is consistent with the domain experts in the task of identifying similar reports. CONCLUSION We developed and evaluated an automated pipeline that could identify three attributes from the medication event reports and calculate the similarity scores between the reports based on the attributes. 
The pipeline is expected to improve the efficiency of analyzing the medication event reports and to learn from the reports in a timely manner.
Affiliation(s)
- Sicheng Zhou
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, 7000 Fannin Street, Suite 600, Houston, 77030, TX, USA
- Hong Kang
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, 7000 Fannin Street, Suite 600, Houston, 77030, TX, USA
- Bin Yao
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, 7000 Fannin Street, Suite 600, Houston, 77030, TX, USA
- Yang Gong
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, 7000 Fannin Street, Suite 600, Houston, 77030, TX, USA.
|
35
|
Affiliation(s)
- Xin Zheng
- School of Computer Science and Engineering, Nanyang Technological University; Singapore 639798
- SAP Innovation Center Network, SAP Asia Pte Ltd; Singapore 119968
- Aixin Sun
- School of Computer Science and Engineering, Nanyang Technological University; Singapore 639798
|
36
|
Hoang T, Liu J, Pratt N, Zheng VW, Chang KC, Roughead E, Li J. Authenticity and credibility aware detection of adverse drug events from social media. Int J Med Inform 2018; 120:157-171. [PMID: 30409341 DOI: 10.1016/j.ijmedinf.2018.10.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 01/20/2017] [Revised: 09/11/2018] [Accepted: 10/09/2018] [Indexed: 11/16/2022]
Abstract
OBJECTIVES Adverse drug events (ADEs) are among the top causes of hospitalization and death. Social media is a promising open data source for the timely detection of potential ADEs. In this paper, we study the problem of detecting signals of ADEs from social media. METHODS Detecting ADEs whose drug and AE may be reported in different posts of a user leads to major concerns regarding the content authenticity and user credibility, which have not been addressed in previous studies. Content authenticity concerns whether a post mentions drugs or adverse events that are actually consumed or experienced by the writer. User credibility indicates the degree to which chronological evidence from a user's sequence of posts should be trusted in the ADE detection. We propose AC-SPASM, a Bayesian model for the authenticity and credibility aware detection of ADEs from social media. The model exploits the interaction between content authenticity, user credibility and ADE signal quality. In particular, we argue that the credibility of a user correlates with the user's consistency in reporting authentic content. RESULTS We conduct experiments on a real-world Twitter dataset containing 1.2 million posts from 13,178 users. Our benchmark set contains 22 drugs and 8089 AEs. AC-SPASM recognizes authentic posts with F1 - the harmonic mean of precision and recall of 80%, and estimates user credibility with precision@10 = 90% and NDCG@10 - a measure for top-10 ranking quality of 96%. Upon validation against known ADEs, AC-SPASM achieves F1 = 91%, outperforming state-of-the-art baseline models by 32% (p < 0.05). Also, AC-SPASM obtains precision@456 = 73% and NDCG@456 = 94% in detecting and prioritizing unknown potential ADE signals for further investigation. Furthermore, the results show that AC-SPASM is scalable to large datasets. CONCLUSIONS Our study demonstrates that taking into account the content authenticity and user credibility improves the detection of ADEs from social media. 
Our work generates hypotheses to reduce experts' guesswork in identifying unknown potential ADEs.
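The results above are reported as F1, precision@k and NDCG@k. For readers unfamiliar with these metrics, the following sketch shows their standard textbook definitions in plain Python. The function names are illustrative, and the NDCG variant shown (gain equal to the relevance score, log2 discount, ideal ranking taken from the same list) is one common convention that may differ in detail from the one used in the paper.

```python
import math


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_at_k(relevance, k):
    """Share of the top-k ranked items that are relevant (non-zero)."""
    return sum(1 for r in relevance[:k] if r) / k


def ndcg_at_k(relevance, k):
    """Normalized discounted cumulative gain over the top-k ranked items.

    `relevance` lists relevance scores in ranked order, best-ranked first.
    """
    def dcg(scores):
        return sum(s / math.log2(i + 2) for i, s in enumerate(scores[:k]))

    ideal = dcg(sorted(relevance, reverse=True))
    return dcg(relevance) / ideal if ideal else 0.0
```

Here `relevance` would hold, for example, 1 for a ranked candidate that matches a known ADE and 0 otherwise.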
Affiliation(s)
- Tao Hoang
- School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, South Australia 5095, Australia.
- Jixue Liu
- School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, South Australia 5095, Australia
- Nicole Pratt
- School of Pharmacy and Medical Sciences, University of South Australia, City East Campus, North Terrace, South Australia 5000, Australia
- Vincent W Zheng
- Advanced Digital Sciences Center, 1 Fusionopolis Way, #08-10 Connexis North Tower, Singapore 138632, Singapore
- Kevin C Chang
- Department of Computer Science, University of Illinois at Urbana-Champaign, 201 N Goodwin Ave, Urbana, IL 61801, United States
- Elizabeth Roughead
- School of Pharmacy and Medical Sciences, University of South Australia, City East Campus, North Terrace, South Australia 5000, Australia
- Jiuyong Li
- School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, South Australia 5095, Australia
|
37
|
Hoang T, Liu J, Pratt N, Zheng VW, Chang KC, Roughead E, Li J. Authenticity and credibility aware detection of adverse drug events from social media. Int J Med Inform 2018; 120:101-115. [PMID: 30409335 DOI: 10.1016/j.ijmedinf.2018.09.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 01/20/2017] [Accepted: 09/03/2018] [Indexed: 11/29/2022]
Abstract
OBJECTIVES Adverse drug events (ADEs) are among the top causes of hospitalization and death. Social media is a promising open data source for the timely detection of potential ADEs. In this paper, we study the problem of detecting signals of ADEs from social media. METHODS Detecting ADEs whose drug and AE may be reported in different posts of a user leads to major concerns regarding the content authenticity and user credibility, which have not been addressed in previous studies. Content authenticity concerns whether a post mentions drugs or adverse events that are actually consumed or experienced by the writer. User credibility indicates the degree to which chronological evidence from a user's sequence of posts should be trusted in the ADE detection. We propose AC-SPASM, a Bayesian model for the authenticity and credibility aware detection of ADEs from social media. The model exploits the interaction between content authenticity, user credibility and ADE signal quality. In particular, we argue that the credibility of a user correlates with the user's consistency in reporting authentic content. RESULTS We conduct experiments on a real-world Twitter dataset containing 1.2 million posts from 13,178 users. Our benchmark set contains 22 drugs and 8089 AEs. AC-SPASM recognizes authentic posts with F1 - the harmonic mean of precision and recall of 80%, and estimates user credibility with precision@10 = 90% and NDCG@10 - a measure for top-10 ranking quality of 96%. Upon validation against known ADEs, AC-SPASM achieves F1 = 91%, outperforming state-of-the-art baseline models by 32% (p < 0.05). Also, AC-SPASM obtains precision@456 = 73% and NDCG@456 = 94% in detecting and prioritizing unknown potential ADE signals for further investigation. Furthermore, the results show that AC-SPASM is scalable to large datasets. CONCLUSIONS Our study demonstrates that taking into account the content authenticity and user credibility improves the detection of ADEs from social media. 
Our work generates hypotheses to reduce experts' guesswork in identifying unknown potential ADEs.
Affiliation(s)
- Tao Hoang
- School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, Adelaide, South Australia 5095, Australia.
- Jixue Liu
- School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, Adelaide, South Australia 5095, Australia
- Nicole Pratt
- School of Pharmacy and Medical Sciences, University of South Australia, City East Campus, North Terrace, Adelaide, South Australia 5000, Australia
- Vincent W Zheng
- Advanced Digital Sciences Center, 1 Fusionopolis Way, #08-10 Connexis North Tower, Singapore, 138632, Singapore
- Kevin C Chang
- Department of Computer Science, University of Illinois at Urbana-Champaign, 201 N Goodwin Ave, Urbana, IL 61801, United States
- Elizabeth Roughead
- School of Pharmacy and Medical Sciences, University of South Australia, City East Campus, North Terrace, Adelaide, South Australia 5000, Australia
- Jiuyong Li
- School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, Adelaide, South Australia 5095, Australia
|
38
|
Azam R. Accessing social media information for pharmacovigilance: what are the ethical implications? Ther Adv Drug Saf 2018; 9:385-387. [PMID: 30364758 DOI: 10.1177/2042098618778191] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/26/2018] [Accepted: 04/30/2018] [Indexed: 11/16/2022] Open
Affiliation(s)
- Robina Azam
- PRA Health Science, 500 South Oak Way, Greenpark, Reading RG2 6AD, UK
|
39
|
Sahu SK, Anand A. Drug-drug interaction extraction from biomedical texts using long short-term memory network. J Biomed Inform 2018; 86:15-24. [DOI: 10.1016/j.jbi.2018.08.005] [Citation(s) in RCA: 85] [Impact Index Per Article: 12.1] [Received: 07/29/2018] [Accepted: 08/07/2018] [Indexed: 12/15/2022]
|
40
|
The Missing Variable in Big Data for Social Sciences: The Decision-Maker. Sustainability 2018. [DOI: 10.3390/su10103415] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Indexed: 11/17/2022]
Abstract
The value of big data for social sciences and social impact is professed to be high. This potential value is related, however, to the capacity of using extracted information in decision-making. In all of this, one important point has been overlooked: when “humans” retain a role in the decision-making process, the value of information is no longer an objective feature but depends on the knowledge and mindset of end users. A new big data cycle has been proposed in this paper, where the decision-maker is placed at the centre of the process. The proposed cycle is tested through two cases and, as a result of the suggested approach, two operations—filtering and framing—which are routinely carried out independently by scientists and end users in an unconscious manner, become clear and transparent. The result is a new cycle where four dimensions guide the interactions for creating value.
|
41
|
Thompson P, Daikou S, Ueno K, Batista-Navarro R, Tsujii J, Ananiadou S. Annotation and detection of drug effects in text for pharmacovigilance. J Cheminform 2018; 10:37. [PMID: 30105604 PMCID: PMC6089860 DOI: 10.1186/s13321-018-0290-y] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 01/12/2018] [Accepted: 07/20/2018] [Indexed: 02/02/2023]
Abstract
Pharmacovigilance (PV) databases record the benefits and risks of different drugs, as a means to ensure their safe and effective use. Creating and maintaining such resources can be complex, since a particular medication may have divergent effects in different individuals, due to specific patient characteristics and/or interactions with other drugs being administered. Textual information from various sources can provide important evidence to curators of PV databases about the usage and effects of drug targets in different medical subjects. However, the efficient identification of relevant evidence can be challenging, due to the increasing volume of textual data. Text mining (TM) techniques can support curators by automatically detecting complex information, such as interactions between drugs, diseases and adverse effects. This semantic information supports the quick identification of documents containing information of interest (e.g., the different types of patients in which a given adverse drug reaction has been observed to occur). TM tools are typically adapted to different domains by applying machine learning methods to corpora that are manually labelled by domain experts using annotation guidelines to ensure consistency. We present a semantically annotated corpus of 597 MEDLINE abstracts, PHAEDRA, encoding rich information on drug effects and their interactions, whose quality is assured through the use of detailed annotation guidelines and the demonstration of high levels of inter-annotator agreement (e.g., 92.6% F-Score for identifying named entities and 78.4% F-Score for identifying complex events, when relaxed matching criteria are applied). To our knowledge, the corpus is unique in the domain of PV, according to the level of detail of its annotations. To illustrate the utility of the corpus, we have trained TM tools based on its rich labels to recognise drug effects in text automatically. 
The corpus and annotation guidelines are available at: http://www.nactem.ac.uk/PHAEDRA/ .
Affiliation(s)
- Paul Thompson
- National Centre for Text Mining, School of Computer Science, Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester, M1 7DN UK
| | - Sophia Daikou
- National Centre for Text Mining, School of Computer Science, Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester, M1 7DN UK
| | - Kenju Ueno
- Artificial Intelligence Research Center, National Research and Development Agency (AIST), Tokyo Waterfront 2-3-2 Aomi, Koto-ku, Tokyo, 135-0064 Japan
| | - Riza Batista-Navarro
- National Centre for Text Mining, School of Computer Science, Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester, M1 7DN UK
| | - Jun’ichi Tsujii
- National Centre for Text Mining, School of Computer Science, Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester, M1 7DN UK
- Artificial Intelligence Research Center, National Research and Development Agency (AIST), Tokyo Waterfront 2-3-2 Aomi, Koto-ku, Tokyo, 135-0064 Japan
| | - Sophia Ananiadou
- National Centre for Text Mining, School of Computer Science, Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester, M1 7DN UK
|
42
|
Lardon J, Bellet F, Aboukhamis R, Asfari H, Souvignet J, Jaulent MC, Beyens MN, Lillo-LeLouët A, Bousquet C. Evaluating Twitter as a complementary data source for pharmacovigilance. Expert Opin Drug Saf 2018; 17:763-774. [DOI: 10.1080/14740338.2018.1499724] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 10/28/2022]
Affiliation(s)
- Jérémy Lardon
- Sorbonne Université, UPMC Université Paris 06, UMR_S 1142, LIMICS, Paris, France
- INSERM, U1142, LIMICS, Paris, France
- Université Paris 13, Sorbonne Paris Cité, LIMICS (UMR_S 1142), Bobigny, France
- Department of Public Health and medical informatics, CHU University of Saint-Etienne, Saint-Etienne, France
| | - Florelle Bellet
- Centre de Pharmacovigilance, Centre Hospitalier Universitaire (CHU) University Hospital of Saint-Etienne, Saint-Etienne, France
| | - Rim Aboukhamis
- Centre Régional de Pharmacovigilance, Hôpital Européen Georges Pompidou – Assistance Publique-Hôpitaux de Paris, Paris, France
| | - Hadyl Asfari
- Sorbonne Université, UPMC Université Paris 06, UMR_S 1142, LIMICS, Paris, France
- INSERM, U1142, LIMICS, Paris, France
| | - Julien Souvignet
- Sorbonne Université, UPMC Université Paris 06, UMR_S 1142, LIMICS, Paris, France
- INSERM, U1142, LIMICS, Paris, France
- Université Paris 13, Sorbonne Paris Cité, LIMICS (UMR_S 1142), Bobigny, France
- Department of Public Health and medical informatics, CHU University of Saint-Etienne, Saint-Etienne, France
| | - Marie-Christine Jaulent
- Sorbonne Université, UPMC Université Paris 06, UMR_S 1142, LIMICS, Paris, France
- INSERM, U1142, LIMICS, Paris, France
- Université Paris 13, Sorbonne Paris Cité, LIMICS (UMR_S 1142), Bobigny, France
| | - Marie-Noëlle Beyens
- Centre de Pharmacovigilance, Centre Hospitalier Universitaire (CHU) University Hospital of Saint-Etienne, Saint-Etienne, France
| | - Agnès Lillo-LeLouët
- Centre Régional de Pharmacovigilance, Hôpital Européen Georges Pompidou – Assistance Publique-Hôpitaux de Paris, Paris, France
| | - Cédric Bousquet
- Sorbonne Université, UPMC Université Paris 06, UMR_S 1142, LIMICS, Paris, France
- INSERM, U1142, LIMICS, Paris, France
- Université Paris 13, Sorbonne Paris Cité, LIMICS (UMR_S 1142), Bobigny, France
- Department of Public Health and medical informatics, CHU University of Saint-Etienne, Saint-Etienne, France
|
43
|
Liu J, Wang G. Pharmacovigilance from social media: An improved random subspace method for identifying adverse drug events. Int J Med Inform 2018; 117:33-43. [PMID: 30032963 DOI: 10.1016/j.ijmedinf.2018.06.008] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 01/18/2018] [Revised: 05/10/2018] [Accepted: 06/12/2018] [Indexed: 11/17/2022]
Abstract
OBJECTIVE Recent advances in Web 2.0 technologies have seen significant strides towards utilizing patient-generated content for pharmacovigilance. Social media-based pharmacovigilance has great potential to augment current efforts and provide regulatory authorities with valuable decision aids. Among various pharmacovigilance activities, identifying adverse drug events (ADEs) is very important for patient safety. However, in health-related discussion forums, ADEs may be confused with drug indications, beneficial effects, and other semantic types. Therefore, the focus of this study is to develop a strategy to distinguish ADEs from other semantic types and to determine the drug with which an ADE is associated. MATERIALS AND METHODS In this study, two groups of features, i.e., shallow linguistic features and semantic features, are explored. Moreover, motivated by the characteristics of these two feature categories for social media-based ADE identification, an improved random subspace method, called Stratified Sampling-based Random Subspace (SSRS), is proposed. Unlike the conventional random subspace method, which applies random sampling for subspace selection, SSRS adopts a stratified sampling-based subspace selection strategy. RESULTS A case study on heart disease discussion forums is performed to evaluate the effectiveness of the SSRS method. Experimental results reveal that the proposed SSRS method significantly outperforms the other ensemble methods compared and existing approaches for ADE identification. DISCUSSION AND CONCLUSION Our proposed method is easy to implement since it is based on two feature sets that can be naturally derived, and therefore can omit artificial stratum generation efforts. Moreover, SSRS has great potential for application to other high-dimensional problems in which the original data can be represented from two different aspects.
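The abstract contrasts plain random subspace selection with stratified selection over two feature groups, but it does not give the allocation rule. The following is a minimal sketch of one plausible reading, not the authors' SSRS implementation: each subspace draws from both strata, with proportional allocation. The function name `stratified_subspace`, the proportional rule, and the toy feature names are all assumptions made for illustration.

```python
import random
from typing import List, Sequence


def stratified_subspace(
    shallow: Sequence[str],
    semantic: Sequence[str],
    size: int,
    rng: random.Random,
) -> List[str]:
    """Pick `size` features so that both strata are represented.

    Assumption: features are allocated proportionally to stratum size,
    with at least one feature per stratum. The abstract does not state
    the allocation rule actually used by SSRS.
    """
    if not shallow or not semantic:
        raise ValueError("both feature groups must be non-empty")
    total = len(shallow) + len(semantic)
    if not 2 <= size <= total:
        raise ValueError("size must be between 2 and the total feature count")

    # Proportional share for the shallow stratum, clamped so that each
    # stratum contributes at least one feature.
    share = round(size * len(shallow) / total)
    n_shallow = max(1, min(share, size - 1, len(shallow)))
    n_semantic = size - n_shallow
    if n_semantic > len(semantic):
        # Not enough semantic features: take them all, fill from shallow.
        n_semantic = len(semantic)
        n_shallow = size - n_semantic

    return rng.sample(list(shallow), n_shallow) + rng.sample(list(semantic), n_semantic)


if __name__ == "__main__":
    rng = random.Random(0)
    shallow = [f"shallow_{i}" for i in range(6)]    # hypothetical feature names
    semantic = [f"semantic_{i}" for i in range(4)]  # hypothetical feature names
    # One subspace per base learner in the ensemble.
    for _ in range(3):
        print(stratified_subspace(shallow, semantic, 5, rng))
```

A conventional random subspace method would instead sample `size` features from the pooled list, so a given subspace could contain features from only one of the two groups.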
Affiliation(s)
- Jing Liu
- School of Management Science and Engineering, Tianjin University of Finance and Economics, Tianjin 300222, PR China
| | - Gang Wang
- School of Management, Hefei University of Technology, Hefei, Anhui 230009, PR China.
|
44
|
Tricco AC, Zarin W, Lillie E, Jeblee S, Warren R, Khan PA, Robson R, Pham B, Hirst G, Straus SE. Utility of social media and crowd-intelligence data for pharmacovigilance: a scoping review. BMC Med Inform Decis Mak 2018; 18:38. [PMID: 29898743 PMCID: PMC6001022 DOI: 10.1186/s12911-018-0621-y] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 10/19/2017] [Accepted: 05/31/2018] [Indexed: 12/15/2022]
Abstract
BACKGROUND A scoping review to characterize the literature on the use of conversations in social media as a potential source of data for detecting adverse events (AEs) related to health products. METHODS Our specific research questions were (1) What social media listening platforms exist to detect adverse events related to health products, and what are their capabilities and characteristics? (2) What is the validity and reliability of data from social media for detecting these adverse events? MEDLINE, EMBASE, Cochrane Library, and relevant websites were searched from inception to May 2016. Any type of document (e.g., manuscripts, reports) that described the use of social media data for detecting health product AEs was included. Two reviewers independently screened citations and full-texts, and one reviewer and one verifier performed data abstraction. Descriptive synthesis was conducted. RESULTS After screening 3631 citations and 321 full-texts, 70 unique documents with 7 companion reports available from 2001 to 2016 were included. Forty-six documents (66%) described an automated or semi-automated information extraction system to detect health product AEs from social media conversations (in the developmental phase). Seven pre-existing information extraction systems to mine social media data were identified in eight documents. Nineteen documents compared AEs reported in social media data with validated data and found consistent AE discovery in all except two documents. None of the documents reported the validity and reliability of the overall system, but some reported on the performance of individual steps in processing the data. The validity and reliability results were found for the following steps in the data processing pipeline: data de-identification (n = 1), concept identification (n = 3), concept normalization (n = 2), and relation extraction (n = 8). The methods varied widely, and some approaches yielded better results than others. 
CONCLUSIONS Our results suggest that the use of social media conversations for pharmacovigilance is in its infancy. Although social media data has the potential to supplement data from regulatory agency databases; is able to capture less frequently reported AEs; and can identify AEs earlier than official alerts or regulatory changes, the utility and validity of the data source remains under-studied. TRIAL REGISTRATION Open Science Framework ( https://osf.io/kv9hu/ ).
Affiliation(s)
- Andrea C. Tricco
- Li Ka Shing Knowledge Institute of St. Michael’s Hospital, 209 Victoria Street, East Building, Toronto, ON M5B 1W8 Canada
- Epidemiology Division, Dalla Lana School of Public Health, University of Toronto, 6th Floor, 155 College St, Toronto, ON M5T 3M7 Canada
| | - Wasifa Zarin
- Li Ka Shing Knowledge Institute of St. Michael’s Hospital, 209 Victoria Street, East Building, Toronto, ON M5B 1W8 Canada
| | - Erin Lillie
- Li Ka Shing Knowledge Institute of St. Michael’s Hospital, 209 Victoria Street, East Building, Toronto, ON M5B 1W8 Canada
| | - Serena Jeblee
- Department of Computer Science, University of Toronto, 10 King’s College Road, Toronto, ON M5S 3G4 Canada
| | - Rachel Warren
- Li Ka Shing Knowledge Institute of St. Michael’s Hospital, 209 Victoria Street, East Building, Toronto, ON M5B 1W8 Canada
| | - Paul A. Khan
- Li Ka Shing Knowledge Institute of St. Michael’s Hospital, 209 Victoria Street, East Building, Toronto, ON M5B 1W8 Canada
| | - Reid Robson
- Li Ka Shing Knowledge Institute of St. Michael’s Hospital, 209 Victoria Street, East Building, Toronto, ON M5B 1W8 Canada
| | - Ba’ Pham
- Li Ka Shing Knowledge Institute of St. Michael’s Hospital, 209 Victoria Street, East Building, Toronto, ON M5B 1W8 Canada
| | - Graeme Hirst
- Department of Computer Science, University of Toronto, 10 King’s College Road, Toronto, ON M5S 3G4 Canada
| | - Sharon E. Straus
- Li Ka Shing Knowledge Institute of St. Michael’s Hospital, 209 Victoria Street, East Building, Toronto, ON M5B 1W8 Canada
- Department of Geriatric Medicine, Faculty of Medicine, University of Toronto, 27 Kings College Circle, Toronto, ON M5S 1A1 Canada
|
45
|
Chen X, Faviez C, Schuck S, Lillo-Le-Louët A, Texier N, Dahamna B, Huot C, Foulquié P, Pereira S, Leroux V, Karapetiantz P, Guenegou-Arnoux A, Katsahian S, Bousquet C, Burgun A. Mining Patients' Narratives in Social Media for Pharmacovigilance: Adverse Effects and Misuse of Methylphenidate. Front Pharmacol 2018; 9:541. [PMID: 29881351 PMCID: PMC5978246 DOI: 10.3389/fphar.2018.00541] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 12/15/2017] [Accepted: 05/04/2018] [Indexed: 12/29/2022]
Abstract
Background: The Food and Drug Administration (FDA) in the United States and the European Medicines Agency (EMA) have recognized social media as a new data source to strengthen their activities regarding drug safety. Objective: Our objective in the ADR-PRISM project was to provide text mining and visualization tools to explore a corpus of posts extracted from social media. We evaluated this approach on a corpus of 21 million posts from five patient forums, and conducted a qualitative analysis of the data available on methylphenidate in this corpus. Methods: We applied text mining methods based on named entity recognition and relation extraction in the corpus, followed by signal detection using the proportional reporting ratio (PRR). We also used topic modeling based on the Correlated Topic Model to obtain the list of thematics in the corpus and classify the messages based on their topics. Results: We automatically identified 3443 posts about methylphenidate published between 2007 and 2016, among which 61 adverse drug reactions (ADRs) were automatically detected. Two pharmacovigilance experts manually evaluated the quality of the automatic identification, and an F-measure of 0.57 was reached. Patients' reports mainly concerned neuro-psychiatric effects. Applying PRR, 67% of the ADRs were signals, including most of the neuro-psychiatric symptoms but also palpitations. Topic modeling showed that the most represented topics were related to Childhood and Treatment initiation, but also Side effects. Cases of misuse were also identified in this corpus, including recreational use and abuse. Conclusion: Named entity recognition combined with signal detection and topic modeling have demonstrated their complementarity in mining social media data. An in-depth analysis focused on methylphenidate showed that this approach was able to detect potential signals and to provide better understanding of patients' behaviors regarding drugs, including misuse.
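The abstract names the proportional reporting ratio (PRR) without defining it. As a minimal sketch, and not the ADR-PRISM implementation, the standard PRR from a 2x2 contingency table can be computed as below. The function names, the example counts, and the screening rule (PRR of at least 2 with at least 3 cases, a simplified form of a commonly cited heuristic that omits its chi-square condition) are illustrative assumptions; the abstract does not state which threshold the authors applied.

```python
def proportional_reporting_ratio(a: int, b: int, c: int, d: int) -> float:
    """Standard PRR from a 2x2 table of report (or post) counts.

    a: drug of interest, event of interest
    b: drug of interest, all other events
    c: all other drugs, event of interest
    d: all other drugs, all other events
    """
    if a + b == 0 or c + d == 0:
        raise ValueError("each drug group needs at least one report")
    if c == 0:
        raise ValueError("PRR is undefined when no comparator report has the event")
    rate_drug = a / (a + b)    # share of the drug's reports that mention the event
    rate_other = c / (c + d)   # same share among all other drugs
    return rate_drug / rate_other


def is_signal(a: int, b: int, c: int, d: int,
              min_prr: float = 2.0, min_cases: int = 3) -> bool:
    """Assumed screening rule: PRR >= min_prr and at least min_cases cases."""
    return a >= min_cases and proportional_reporting_ratio(a, b, c, d) >= min_prr


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    counts = (20, 980, 100, 98900)
    print(f"PRR = {proportional_reporting_ratio(*counts):.1f}")
    print("signal" if is_signal(*counts) else "no signal")
```

A PRR well above 1 means the event is mentioned disproportionately often with the drug of interest compared with all other drugs in the corpus; it indicates a statistical association to review, not proof of causality.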
Affiliation(s)
- Xiaoyi Chen
- UMRS 1138, équipe 22, Institut National de la Santé et de la Recherche Médicale, Centre de Recherche des Cordeliers, Université Paris Descartes, Paris, France
| | | | | | - Agnès Lillo-Le-Louët
- Centre Régional de Pharmacovigilance, Hôpital Européen Georges-Pompidou, AP-HP, Paris, France
| | | | - Badisse Dahamna
- Service d'Informatique Biomédicale, Centre Hospitalier Universitaire de Rouen, Rouen, France.,Laboratoire d'Informatique, du Traitement de l'Information et des Systèmes-TIBS EA 4108, Rouen, France
| | | | | | | | | | - Pierre Karapetiantz
- UMRS 1138, équipe 22, Institut National de la Santé et de la Recherche Médicale, Centre de Recherche des Cordeliers, Université Paris Descartes, Paris, France
| | - Armelle Guenegou-Arnoux
- UMRS 1138, équipe 22, Institut National de la Santé et de la Recherche Médicale, Centre de Recherche des Cordeliers, Université Paris Descartes, Paris, France
| | - Sandrine Katsahian
- UMRS 1138, équipe 22, Institut National de la Santé et de la Recherche Médicale, Centre de Recherche des Cordeliers, Université Paris Descartes, Paris, France.,Département d'Informatique Médicale, Hôpital Européen Georges Pompidou, Paris, France
| | - Cédric Bousquet
- Sorbonne Université, Inserm, université Paris 13, Laboratoire d'informatique médicale et d'ingénierie des connaissances en e-santé, LIMICS, Paris, France
| | - Anita Burgun
- UMRS 1138, équipe 22, Institut National de la Santé et de la Recherche Médicale, Centre de Recherche des Cordeliers, Université Paris Descartes, Paris, France.,Département d'Informatique Médicale, Hôpital Européen Georges Pompidou, Paris, France
|
46
|
Karapetiantz P, Bellet F, Audeh B, Lardon J, Leprovost D, Aboukhamis R, Morlane-Hondère F, Grouin C, Burgun A, Katsahian S, Jaulent MC, Beyens MN, Lillo-Le Louët A, Bousquet C. Descriptions of Adverse Drug Reactions Are Less Informative in Forums Than in the French Pharmacovigilance Database but Provide More Unexpected Reactions. Front Pharmacol 2018; 9:439. [PMID: 29765326 PMCID: PMC5938397 DOI: 10.3389/fphar.2018.00439] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 01/31/2018] [Accepted: 04/13/2018] [Indexed: 01/28/2023]
Abstract
Background: Social media have drawn attention for their potential use in Pharmacovigilance. Recent work showed that it is possible to extract information concerning adverse drug reactions (ADRs) from posts in social media. The main objective of the Vigi4MED project was to evaluate the relevance and quality of the information shared by patients on web forums about drug safety and its potential utility for pharmacovigilance. Methods: After selecting websites of interest, we manually evaluated the relevance of the content of posts for pharmacovigilance related to six drugs (agomelatine, baclofen, duloxetine, exenatide, strontium ranelate, and tetrazepam). We compared forums to the French Pharmacovigilance Database (FPVD) to (1) evaluate whether they contained relevant information to characterize a pharmacovigilance case report (patient’s age and sex; treatment indication, dose and duration; time-to-onset (TTO) and outcome of the ADR, and drug dechallenge and rechallenge) and (2) perform impact analysis (nature, seriousness, unexpectedness, and outcome of the ADR). Results: The cases in the FPVD were significantly more informative than posts in forums for patient description (age, sex), treatment description (dose, duration, TTO), and outcome of the ADR, but the indication for the treatment was more often found in forums. Cases were more often serious in the FPVD than in forums (46% vs. 4%), but forums more often contained an unexpected ADR than the FPVD (24% vs. 17%). Moreover, 197 unexpected ADRs identified in forums were absent from the FPVD and the distribution of the MedDRA System Organ Classes (SOCs) was different between the two data sources. Discussion: This study is the first to evaluate if patients’ posts may qualify as potential and informative case reports that should be stored in a pharmacovigilance database in the same way as case reports submitted by health professionals. 
The posts were less informative (except for the indication) and focused on less serious ADRs than the FPVD cases, but more unexpected ADRs were presented in forums than in the FPVD and their SOCs were different. Thus, web forums should be considered as a secondary, but complementary source for pharmacovigilance.
Affiliation(s)
- Pierre Karapetiantz
- Sorbonne Université, INSERM, Université Paris 13, Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances en e-Santé, Paris, France
| | - Florelle Bellet
- Centre Régional de Pharmacovigilance, Centre Hospitalier Universitaire de Saint-Étienne, Hôpital Nord, Saint-Étienne, France
| | - Bissan Audeh
- Université de Lyon, IMT Mines Saint-Etienne, Institut Henri Fayol, Département ISI, Université Jean Monnet, Institut d'Optique Graduate School, Centre National de la Recherche Scientifique, Laboratoire Hubert Curien, Saint-Étienne, France
| | - Jérémy Lardon
- Sorbonne Université, INSERM, Université Paris 13, Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances en e-Santé, Paris, France
| | - Damien Leprovost
- Sorbonne Université, INSERM, Université Paris 13, Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances en e-Santé, Paris, France
| | - Rim Aboukhamis
- Centre Régional de Pharmacovigilance, Hôpital Européen Georges-Pompidou, Assistance Publique - Hôpitaux de Paris, Paris, France
| | | | - Cyril Grouin
- LIMSI, CNRS, Université Paris-Saclay, Orsay, France
| | - Anita Burgun
- INSERM UMRS1138 Centre de Recherche des Cordeliers, Paris, France.,Département d'Informatique Médicale, Hôpital Européen Georges-Pompidou, Assistance Publique - Hôpitaux de Paris, Paris, France
| | - Sandrine Katsahian
- INSERM UMRS1138 Centre de Recherche des Cordeliers, Paris, France.,Département d'Informatique Médicale, Hôpital Européen Georges-Pompidou, Assistance Publique - Hôpitaux de Paris, Paris, France
| | - Marie-Christine Jaulent
- Sorbonne Université, INSERM, Université Paris 13, Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances en e-Santé, Paris, France
| | - Marie-Noëlle Beyens
- Centre Régional de Pharmacovigilance, Centre Hospitalier Universitaire de Saint-Étienne, Hôpital Nord, Saint-Étienne, France
| | - Agnès Lillo-Le Louët
- Centre Régional de Pharmacovigilance, Hôpital Européen Georges-Pompidou, Assistance Publique - Hôpitaux de Paris, Paris, France
| | - Cédric Bousquet
- Sorbonne Université, INSERM, Université Paris 13, Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances en e-Santé, Paris, France
|
47
|
Abstract
The digital world is generating data at a staggering and still increasing rate. While these "big data" have unlocked novel opportunities to understand public health, they hold still greater potential for research and practice. This review explores several key issues that have arisen around big data. First, we propose a taxonomy of sources of big data to clarify terminology and identify threads common across some subtypes of big data. Next, we consider common public health research and practice uses for big data, including surveillance, hypothesis-generating research, and causal inference, while exploring the role that machine learning may play in each use. We then consider the ethical implications of the big data revolution with particular emphasis on maintaining appropriate care for privacy in a world in which technology is rapidly changing social norms regarding the need for (and even the meaning of) privacy. Finally, we make suggestions regarding structuring teams and training to succeed in working with big data in research and practice.
Affiliation(s)
- Stephen J Mooney
- Harborview Injury Prevention and Research Center, University of Washington, Seattle, Washington 98122, USA;
| | - Vikas Pejaver
- Department of Biomedical Informatics and Medical Education and the eScience Institute, University of Washington, Seattle, Washington 98109, USA;
|
48
|
Abdellaoui R, Foulquié P, Texier N, Faviez C, Burgun A, Schück S. Detection of Cases of Noncompliance to Drug Treatment in Patient Forum Posts: Topic Model Approach. J Med Internet Res 2018. [PMID: 29540337 PMCID: PMC5874436 DOI: 10.2196/jmir.9222] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Indexed: 11/15/2022]
Abstract
Background: Medication nonadherence is a major impediment to the management of many health conditions. A better understanding of the factors underlying noncompliance to treatment may help health professionals to address it. Patients use peer-to-peer virtual communities and social media to share their experiences regarding their treatments and diseases. Using topic models makes it possible to model themes present in a collection of posts, and thus to identify cases of noncompliance. Objective: The aim of this study was to detect messages describing patients’ noncompliant behaviors associated with a drug of interest. Thus, the objective was the clustering of posts featuring a homogeneous vocabulary related to nonadherent attitudes. Methods: We focused on escitalopram and aripiprazole, used to treat depression and psychotic conditions, respectively. We implemented a probabilistic topic model to identify the topics that occurred in a corpus of messages mentioning these drugs, posted from 2004 to 2013 on three of the most popular French forums. Data were collected using a Web crawler designed by Kappa Santé as part of the Detec’t project to analyze social media for drug safety. Several topics were related to noncompliance to treatment. Results: Starting from a corpus of 3650 posts related to an antidepressant drug (escitalopram) and 2164 posts related to an antipsychotic drug (aripiprazole), the use of latent Dirichlet allocation allowed us to model several themes, including interruptions of treatment and changes in dosage. The topic model approach detected cases of noncompliant behaviors with a recall of 98.5% (272/276) and a precision of 32.6% (272/844). Conclusions: Topic models enabled us to explore patients’ discussions on community websites and to identify posts related to noncompliant behaviors. After a manual review of the messages in the noncompliance topics, we found that noncompliance to treatment was present in 6.17% (276/4469) of the posts.
Affiliation(s)
- Redhouane Abdellaoui
- Unité de Mixte de Recherche 1138 Team 22, Institut National de la Santé et de la Recherche Médicale / Université Pierre et Marie Curie, Paris, France
| | | | | | | | - Anita Burgun
- Unité de Mixte de Recherche 1138 Team 22, Institut National de la Santé et de la Recherche Médicale / Université Pierre et Marie Curie, Paris, France.,Medical Informatics, Hôpital Européen Georges-Pompidou, Assistance Publique-Hôpitaux de Paris, Paris, France
|
49
|
Zhou L, Zhang D, Yang C, Wang Y. Harnessing Social Media for Health Information Management. Electronic Commerce Research and Applications 2018; 27:139-151. [PMID: 30147636 PMCID: PMC6105292 DOI: 10.1016/j.elerap.2017.12.003] [Citation(s) in RCA: 49] [Impact Index Per Article: 7.0] [Indexed: 05/30/2023]
Abstract
The remarkable upsurge of social media has had a dramatic impact on health care research and practice over the past decade. Social media are reshaping health information management in a variety of ways, ranging from providing cost-effective ways to improve clinician-patient communication and exchange health-related information and experience, to enabling the discovery of new medical knowledge and information. Despite some demonstrated initial success, social media use and analytics for improving health is still in its infancy as a research field. Information systems researchers can potentially play a key role in advancing the field. This study proposes a conceptual framework for social media-based health information management by drawing on multi-disciplinary research. With the guidance of the framework, this research presents related research challenges, identifies important yet under-explored research issues, and discusses promising directions for future research.
Affiliation(s)
- Lina Zhou
- University of Maryland, Baltimore County
| | - Dongsong Zhang
- International Business School, Jinan University, China
- University of Maryland, Baltimore County
| | | | - Yu Wang
- International Business School, Jinan University, China
|
50
|
Smith MY, Benattia I. The Patient's Voice in Pharmacovigilance: Pragmatic Approaches to Building a Patient-Centric Drug Safety Organization. Drug Saf 2017; 39:779-85. [PMID: 27098248 PMCID: PMC4982890 DOI: 10.1007/s40264-016-0426-9] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Indexed: 11/30/2022]
Abstract
Patient-centeredness has become an acknowledged hallmark of not only high-quality health care but also high-quality drug development. Biopharmaceutical companies are actively seeking to be more patient-centric in drug research and development by involving patients in identifying target disease conditions, participating in the design of, and recruitment for, clinical trials, and disseminating study results. Drug safety departments within the biopharmaceutical industry are at a similar inflection point. Rising rates of per capita prescription drug use underscore the importance of having robust pharmacovigilance systems in place to detect and assess adverse drug reactions (ADRs). At the same time, the practice of pharmacovigilance is being transformed by a host of recent regulatory guidances and related initiatives which emphasize the importance of the patient’s perspective in drug safety. Collectively, these initiatives impact the full range of activities that fall within the remit of pharmacovigilance, including ADR reporting, signal detection and evaluation, risk management, medication error assessment, benefit–risk assessment and risk communication. Examples include the fact that manufacturing authorization holders are now expected to monitor all digital sources under their control for potential reports of ADRs, and the emergence of new methods for collecting, analysing and reporting patient-generated ADR reports for signal detection and evaluation purposes. A drug safety department’s ability to transition successfully into a more patient-centric organization will depend on three defining attributes: (1) a patient-centered culture; (2) deployment of a framework to guide patient engagement activities; and (3) demonstrated proficiency in patient-centered competencies, including patient engagement, risk communication and patient preference assessment. 
Whether, and to what extent, drug safety departments embrace the new patient-centric imperative, and the methods and processes they implement to achieve this end effectively and efficiently, promise to become distinguishing factors in the highly competitive biopharmaceutical industry landscape.
Affiliation(s)
- Meredith Y Smith
- Amgen Inc., One Amgen Center Drive, Thousand Oaks, CA, 91320, USA.
| | - Isma Benattia
- Amgen Inc., One Amgen Center Drive, Thousand Oaks, CA, 91320, USA
|