1
Golder S, O'Connor K, Wang Y, Klein A, Gonzalez Hernandez G. The Value of Social Media Analysis for Adverse Events Detection and Pharmacovigilance: Scoping Review. JMIR Public Health Surveill 2024; 10:e59167. [PMID: 39240684 DOI: 10.2196/59167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2024] [Revised: 05/03/2024] [Accepted: 05/30/2024] [Indexed: 09/07/2024] Open
Abstract
BACKGROUND Adverse drug events pose an enormous public health burden, leading to hospitalization, disability, and death. Even the adverse events (AEs) categorized as nonserious can severely affect patients' quality of life, adherence, and persistence. Monitoring medication safety is challenging. Web-based patient reports on social media may be a useful supplementary source of real-world data. Despite the growth of sophisticated techniques for identifying AEs using social media data, a consensus has not been reached as to the value of social media in relation to more traditional data sources. OBJECTIVE This study aims to evaluate and characterize the utility of social media analysis in adverse drug event detection and pharmacovigilance as compared with other data sources (such as spontaneous reporting systems and the clinical literature). METHODS In this scoping review, we searched 11 bibliographical databases and Google Scholar, followed by handsearching and forward and backward citation searching. Each record was screened by 2 independent reviewers at both the title and abstract stage and the full-text screening stage. Studies were included if they used any type of social media (such as Twitter or patient forums) to detect AEs associated with any medication and compared the results ascertained from social media to any other data source. Study information was collated using a piloted data extraction sheet. Data were extracted on the AEs and drugs searched for and included; the methods used (such as machine learning); social media data source; volume of data analyzed; limitations of the methodology; availability of data and code; comparison data source and comparison methods; results, including the volume of AEs, and how the AEs found compared with other data sources in their seriousness, frequencies, and expectedness or novelty (new vs known knowledge); and conclusions.
RESULTS Of the 6538 unique records screened, 73 publications representing 60 studies with a wide variety of extraction methods met our inclusion criteria. The most common social media platforms used were Twitter and online health forums. The most common comparator data source was spontaneous reporting systems, although other comparisons were also made, such as with scientific literature and product labels. Although similar patterns of AE reporting tended to be identified, the frequencies were lower in social media. Social media data were found to be useful in identifying new or unexpected AEs and in identifying AEs in a timelier manner. CONCLUSIONS There is a large body of research comparing AEs from social media to other sources. Most studies advocate the use of social media as an adjunct to traditional data sources. Some studies also indicate the value of social media in understanding patient perspectives such as the impact of AEs, which could be better explored. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) RR2-10.2196/47068.
Affiliation(s)
- Su Golder
  - University of York, York, United Kingdom
- Karen O'Connor
  - University of Pennsylvania, Philadelphia, PA, United States
- Yunwen Wang
  - Cedars-Sinai Medical Center, Los Angeles, CA, United States
- Ari Klein
  - University of Pennsylvania, Philadelphia, PA, United States
2
Nishioka S, Asano M, Yada S, Aramaki E, Yajima H, Yanagisawa Y, Sayama K, Kizaki H, Hori S. Adverse event signal extraction from cancer patients' narratives focusing on impact on their daily-life activities. Sci Rep 2023; 13:15516. [PMID: 37726371 PMCID: PMC10509234 DOI: 10.1038/s41598-023-42496-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2023] [Accepted: 09/11/2023] [Indexed: 09/21/2023] Open
Abstract
Adverse event (AE) management is important to improve anti-cancer treatment outcomes, but it is known that some AE signals can be missed during clinical visits. In particular, AEs that affect patients' activities of daily living (ADL) need careful monitoring as they may require immediate medical intervention. This study aimed to build deep-learning (DL) models for extracting signals of AEs limiting ADL from patients' narratives. The data source was blog posts written in Japanese by breast cancer patients. After pre-processing and annotation for AE signals, three DL models (BERT, ELECTRA, and T5) were trained and tested in three different approaches for AE signal identification. The performances of the trained models were evaluated in terms of precision, recall, and F1 scores. From 2,272 blog posts, 191 and 702 articles were identified as describing AEs limiting ADL or not limiting ADL, respectively. Among the tested DL models and approaches, T5 showed the best F1 scores for identifying articles with AEs limiting ADL or with any AE: 0.557 and 0.811, respectively. The most frequent AE signals were "pain or numbness", "fatigue" and "nausea". Our results suggest that this AE monitoring scheme focusing on patients' ADL has the potential to reinforce current AE management provided by medical staff.
Affiliation(s)
- Satoshi Nishioka
  - Division of Drug Informatics, Keio University Faculty of Pharmacy, Tokyo, Japan
- Masaki Asano
  - Division of Drug Informatics, Keio University Faculty of Pharmacy, Tokyo, Japan
- Shuntaro Yada
  - Nara Institute of Science and Technology, Nara, Japan
- Eiji Aramaki
  - Nara Institute of Science and Technology, Nara, Japan
- Yuki Yanagisawa
  - Division of Drug Informatics, Keio University Faculty of Pharmacy, Tokyo, Japan
- Kyoko Sayama
  - Division of Drug Informatics, Keio University Faculty of Pharmacy, Tokyo, Japan
- Hayato Kizaki
  - Division of Drug Informatics, Keio University Faculty of Pharmacy, Tokyo, Japan
- Satoko Hori
  - Division of Drug Informatics, Keio University Faculty of Pharmacy, Tokyo, Japan
3
Nishioka S, Watanabe T, Asano M, Yamamoto T, Kawakami K, Yada S, Aramaki E, Yajima H, Kizaki H, Hori S. Identification of hand-foot syndrome from cancer patients' blog posts: BERT-based deep-learning approach to detect potential adverse drug reaction symptoms. PLoS One 2022; 17:e0267901. [PMID: 35507636 PMCID: PMC9067685 DOI: 10.1371/journal.pone.0267901] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/22/2021] [Accepted: 04/18/2022] [Indexed: 12/29/2022] Open
Abstract
Early detection and management of adverse drug reactions (ADRs) is crucial for improving patients' quality of life. Hand-foot syndrome (HFS) is one of the most problematic ADRs for cancer patients. Recently, an increasing number of patients post their daily experiences to internet communities, for example in blogs, where potential ADR signals not captured through routine clinic visits can be described. Therefore, this study aimed to identify patients with potential ADRs, focusing on HFS, from internet blogs by using natural language processing (NLP) deep-learning methods. From 10,646 blog posts, written in Japanese by cancer patients, 149 HFS-positive sentences were extracted after pre-processing, annotation and scrutiny by a certified oncology pharmacist. The HFS-positive sentences contained not only typical HFS expressions like "pain" or "spoon nail", but also unique patient-derived expressions such as onomatopoeic ones. The dataset was divided at a 4 to 1 ratio and used to train and evaluate three NLP deep-learning models: long short-term memory (LSTM), bidirectional LSTM and bidirectional encoder representations from transformers (BERT). The BERT model gave the best performance with precision 0.63, recall 0.82 and F1 score 0.71 in the HFS user identification task. Our results demonstrate that this NLP deep-learning model can successfully identify patients with potential HFS from blog posts, where patients' real wordings on symptoms or impacts on their daily lives are described. Thus, it should be feasible to utilize patient-generated text data to improve ADR management for individual patients.
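As a quick consistency check on the metrics quoted in this abstract, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the two reported values; the function name `f1_score` is illustrative and is not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall quoted in the abstract for the BERT model.
reported_precision = 0.63
reported_recall = 0.82

print(round(f1_score(reported_precision, reported_recall), 2))  # 0.71
```

The recomputed value agrees with the F1 score of 0.71 stated in the abstract.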
Affiliation(s)
- Satoshi Nishioka
  - Keio University Faculty of Pharmacy, Division of Drug Informatics, Tokyo, Japan
- Tomomi Watanabe
  - Keio University Faculty of Pharmacy, Division of Drug Informatics, Tokyo, Japan
- Masaki Asano
  - Keio University Faculty of Pharmacy, Division of Drug Informatics, Tokyo, Japan
- Tatsunori Yamamoto
  - Keio University Faculty of Pharmacy, Division of Drug Informatics, Tokyo, Japan
- Kazuyoshi Kawakami
  - Department of Pharmacy, Cancer Institute Hospital, Japanese Foundation for Cancer Research, Tokyo, Japan
- Shuntaro Yada
  - Nara Institute of Science and Technology, Nara, Japan
- Eiji Aramaki
  - Nara Institute of Science and Technology, Nara, Japan
- Hayato Kizaki
  - Keio University Faculty of Pharmacy, Division of Drug Informatics, Tokyo, Japan
- Satoko Hori
  - Keio University Faculty of Pharmacy, Division of Drug Informatics, Tokyo, Japan
4
Arquembourg J, Glaser P, Roblot F, Metzler I, Gallant-Dewavrin M, Mebarki A, Voillot P, Schück S, Lalaude O. Social Media Platforms Listening Study on Antibiotic Resistance: Quantitative and Qualitative Findings (Preprint). JMIR Form Res 2022. [DOI: 10.2196/37160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
5
Voillot P, Riche B, Portafax M, Foulquié P, Gedik A, Barbarot S, Misery L, Héas S, Mebarki A, Texier N, Schück S. Social Media Platforms Listening Study on Atopic Dermatitis: Quantitative and Qualitative Findings. J Med Internet Res 2022; 24:e31140. [PMID: 35089160 PMCID: PMC8838596 DOI: 10.2196/31140] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/11/2021] [Revised: 10/04/2021] [Accepted: 11/30/2021] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND Atopic dermatitis (AD) is a chronic, pruritic, inflammatory disease that occurs most frequently in children but also affects many adults. Social media have become key tools for finding and disseminating medical information. OBJECTIVE The aims of this study were to identify the main themes of discussion, the difficulties encountered by patients with respect to AD, the impact of the pathology on quality of life (QoL; physical, psychological, social, or financial), and to study the perception of patients regarding their treatment. METHODS A retrospective study was carried out by collecting social media posts in French language written by internet users mentioning their experience with AD, their QoL, and their treatments. Messages related to AD discomfort posted between July 1, 2010, and October 23, 2020, were extracted from French-speaking publicly available online forums. Automatic and manual extractions were implemented to create a general corpus and 2 subcorpuses depending on the level of control of the disease. RESULTS A total of 33,115 messages associated with AD were included in the analysis corpus after extraction and cleaning. These messages were posted by 15,857 separate web users, most of them being women younger than 40 years. Tips to manage AD and everyday hygiene/treatments were among the most discussed topics for controlled AD subcorpus, while baby-related topics and therapeutic failure were among the most discussed topics for insufficiently controlled AD subcorpus. QoL was discussed in both subcorpuses with a higher proportion in the controlled AD subcorpus. Treatments and their perception were also discussed by web users. CONCLUSIONS More than just emotional or peer support, patients with AD turn to online forums to discuss their health. 
Our findings show the need for an intersection between social media and health care and the importance of developing new approaches such as the Atopic Dermatitis Control Tool, which is a patient-related disease severity assessment tool focused on patients with AD.
Affiliation(s)
- Laurent Misery
  - Centre Hospitalier Universitaire de Brest, Brest, France
6
Renner S, Marty T, Khadhar M, Foulquié P, Voillot P, Mebarki A, Montagni I, Texier N, Schück S. A New Method to Extract Health-Related Quality of Life Data From Social Media Testimonies: Algorithm Development and Validation. J Med Internet Res 2022; 24:e31528. [PMID: 35089152 PMCID: PMC8838601 DOI: 10.2196/31528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2021] [Revised: 10/05/2021] [Accepted: 10/29/2021] [Indexed: 11/13/2022] Open
Abstract
Background Monitoring social media has been shown to be a useful means to capture patients’ opinions and feelings about medical issues, ranging from diseases to treatments. Health-related quality of life (HRQoL) is a useful indicator of overall patients’ health, which can be captured online. Objective This study aimed to describe a social media listening algorithm able to detect the impact of diseases or treatments on specific dimensions of HRQoL based on posts written by patients in social media and forums. Methods Using a web crawler, 19 forums in France were harvested, and messages related to patients’ experience with disease or treatment were specifically collected. The SF-36 (Short Form Health Survey) and EQ-5D (Euro Quality of Life 5 Dimensions) HRQoL surveys were mixed and adapted for a tailored social media listening system. This was carried out to better capture the variety of expression on social media, resulting in 5 dimensions of the HRQoL, which are physical, psychological, activity-based, social, and financial. Models were trained using cross-validation and hyperparameter optimization. Oversampling was used to increase the infrequent dimension: after annotation, SMOTE (synthetic minority oversampling technique) was used to balance the proportions of the dimensions among messages. Results The training set was composed of 1399 messages, randomly taken from a batch of 20,000 health-related messages coming from forums. The algorithm was able to detect a general impact on HRQoL (sensitivity of 0.83 and specificity of 0.74), a physical impact (0.67 and 0.76), a psychic impact (0.82 and 0.60), an activity-related impact (0.73 and 0.78), a relational impact (0.73 and 0.70), and a financial impact (0.79 and 0.74). Conclusions The development of an innovative method to extract health data from social media as real time assessment of patients’ HRQoL is useful to a patient-centered medical care. 
As a source of real-world data, social media provide a complementary point of view to understand patients’ concerns and unmet needs, as well as shedding light on how diseases and treatments can be a burden in their daily lives.
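The sensitivity and specificity pairs reported in this abstract can be read as rates over a binary confusion matrix. A minimal sketch follows, assuming the usual definitions; the counts are hypothetical, chosen only so that the output matches the rates reported for general HRQoL impact (0.83 and 0.74), and are not taken from the paper.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of truly positive messages that the classifier flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of truly negative messages that the classifier leaves unflagged."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only (not from the study).
tp, fn = 83, 17   # messages that do describe an HRQoL impact
tn, fp = 74, 26   # messages that do not

print(sensitivity(tp, fn))  # 0.83
print(specificity(tn, fp))  # 0.74
```

A pair such as 0.82 and 0.60 (the psychic dimension) therefore means most true mentions are caught, at the cost of more false alarms.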
Affiliation(s)
- Ilaria Montagni
  - Bordeaux Population Health Research Center, UMR 1219, Bordeaux University, Inserm, Bordeaux, France
7
Schück S, Roustamal A, Gedik A, Voillot P, Foulquié P, Penfornis C, Job B. Assessing Patient Perceptions and Experiences of Paracetamol in France: Infodemiology Study Using Social Media Data Mining. J Med Internet Res 2021; 23:e25049. [PMID: 34255645 PMCID: PMC8314157 DOI: 10.2196/25049] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/26/2020] [Revised: 03/24/2021] [Accepted: 04/25/2021] [Indexed: 01/27/2023] Open
Abstract
BACKGROUND Individuals frequently turn to social media to discuss medical conditions and medication, sharing their experiences and information and asking questions among themselves. These online discussions can provide valuable insights into individual perceptions of medical treatment, and increasingly, studies are focusing on the potential use of this information to improve health care management. OBJECTIVE The objective of this infodemiology study was to identify social media posts mentioning paracetamol-containing products to develop a better understanding of patients' opinions and perceptions of the drug. METHODS Posts between January 2003 and March 2019 containing at least one mention of paracetamol were extracted from 18 French forums in May 2019 with the use of the Detec't (Kap Code) web crawler. Posts were then analyzed using the automated Detec't tool, which uses machine learning and text mining methods to inspect social media posts and extract relevant content. Posts were classified into groups: Paracetamol Only, Paracetamol and Opioids, Paracetamol and Others, and the Aggregate group. RESULTS Overall, 44,283 posts were analyzed from 20,883 different users. Post volume over the study period showed a peak in activity between 2009 and 2012, as well as a spike in 2017 in the Aggregate group. The number of posts tended to be higher during winter each year. Posts were made predominantly by women (14,897/20,883, 71.34%), with 12.00% (2507/20,883) made by men and 16.67% (3479/20,883) by individuals of unknown gender. The mean age of web users was 39 (SD 19) years. In the Aggregate group, pain was the most common medical concept discussed (22,257/37,863, 58.78%), and paracetamol risk was the most common discussion topic, addressed in 20.36% (8902/43,725) of posts. 
Doliprane was the most common medication mentioned (14,058/44,283, 31.74%) within the Aggregate group, and tramadol was the most commonly mentioned drug in combination with paracetamol in the Aggregate group (1038/19,587, 5.30%). The most common unapproved indication mentioned within the Paracetamol Only group was fatigue (190/616, with 16.32% positive for an unapproved indication), with reference to dependence made by 1.61% (136/8470) of the web users, accounting for 1.33% (171/12,843) of the posts in the Paracetamol Only group. Dependence mentions in the Paracetamol and Opioids group were provided by 6.94% (248/3576) of web users, accounting for 5.44% (342/6281) of total posts. Reference to overdose was made by 245 web users across 291 posts within the Paracetamol Only group. The most common potential adverse event detected was nausea (306/12843, 2.38%) within the Paracetamol Only group. CONCLUSIONS The use of social media mining with the Detec't tool provided valuable information on the perceptions and understanding of the web users, highlighting areas where providing more information for the general public on paracetamol, as well as other medications, may be of benefit.
8
Yang Y, Wang S, Zhan S. [Utilizing social media data in post-market safety surveillance]. Beijing Da Xue Xue Bao. Yi Xue Ban = Journal of Peking University. Health Sciences 2021; 53:623-627. [PMID: 34145872 PMCID: PMC8220064 DOI: 10.19723/j.issn.1671-167x.2021.03.031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2020] [Indexed: 06/12/2023]
Abstract
Post-marketing surveillance is the principal means of ensuring drug safety, and spontaneous reporting is its essential method. Most spontaneous reports come from medical staff; some come from patients who use the drug. In recent years, posts on social media platforms that mention drugs and related adverse reactions have gradually come to be seen as a new data source similar to spontaneous reports from drug users. These user-generated posts give researchers and regulators new opportunities to conduct post-marketing surveillance mainly from the patient's perspective rather than that of medical professionals, and in theory make it possible to discover drug-related safety issues earlier than traditional methods. As a data source for safety signal detection and signal reinforcement, social media data have distinctive advantages in population coverage, types of drugs, types of adverse reactions, timeliness, and quantity. Most of the social media data used in post-marketing surveillance research are still English-language text, even though people worldwide use many languages across social media platforms. It remains controversial in academic circles whether social media data can serve as a reliable data source for routine post-marketing surveillance. Several obstacles relating to data, methods, and ethics must be overcome before social media data can be leveraged for this purpose. The number of Chinese social media users is large, and Chinese-language social media data are growing rapidly, making them a potential data source for post-marketing surveillance. However, because of the specific characteristics of the Chinese language, the diversity of Chinese text differs from that of English text, and there are not enough accepted corpora for medical scenarios. In addition, the lack of domestic laws and regulations on privacy and security protection of social media data poses further challenges for applying Chinese social media data to post-marketing surveillance. Social media data are undoubtedly significant for post-marketing drug safety surveillance. Overcoming these challenges by developing new technologies and establishing new mechanisms will be an essential direction for future research.
Affiliation(s)
- Yu Yang
  - National Institute of Health Data Science, Peking University, Beijing 100191, China
- Shengfeng Wang
  - Department of Epidemiology and Biostatistics, Peking University School of Public Health, Beijing 100191, China
- Siyan Zhan
  - Department of Epidemiology and Biostatistics, Peking University School of Public Health, Beijing 100191, China
9
Schäfer F, Faviez C, Voillot P, Foulquié P, Najm M, Jeanne JF, Fagherazzi G, Schück S, Le Nevé B. Mapping and Modeling of Discussions Related to Gastrointestinal Discomfort in French-Speaking Online Forums: Results of a 15-Year Retrospective Infodemiology Study. J Med Internet Res 2020; 22:e17247. [PMID: 33141087 PMCID: PMC7671840 DOI: 10.2196/17247] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 11/28/2019] [Revised: 04/30/2020] [Accepted: 06/25/2020] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND Gastrointestinal (GI) discomfort is prevalent and known to be associated with impaired quality of life. Real-world information on factors of GI discomfort and solutions used by people is, however, limited. Social media, including online forums, have been considered a new source of information to examine the health of populations in real-life settings. OBJECTIVE The aims of this retrospective infodemiology study are to identify discussion topics, characterize users, and identify perceived determinants of GI discomfort in web-based messages posted by users of French social media. METHODS Messages related to GI discomfort posted between January 2003 and August 2018 were extracted from 14 French-speaking general and specialized publicly available online forums. Extracted messages were cleaned and deidentified. Relevant medical concepts were determined on the basis of the Medical Dictionary for Regulatory Activities and vernacular terms. The identification of discussion topics was carried out by using a correlated topic model on the basis of the latent Dirichlet allocation. A nonsupervised clustering algorithm was applied to cluster forum users according to the reported symptoms of GI discomfort, discussion topics, and activity on online forums. Users' age and gender were determined by linear regression and application of a support vector machine, respectively, to characterize the identified clusters according to demographic parameters. Perceived factors of GI discomfort were classified by a combined method on the basis of syntactic analysis to identify messages with causality terms and a second topic modeling in a relevant segment of phrases. RESULTS A total of 198,866 messages associated with GI discomfort were included in the analysis corpus after extraction and cleaning. These messages were posted by 36,989 separate web users, most of them being women younger than 40 years. 
Everyday life, diet, digestion, abdominal pain, impact on the quality of life, and tips to manage stress were among the most discussed topics. Segmentation of users identified 5 clusters corresponding to chronic and acute GI concerns. Diet topic was associated with each cluster, and stress was strongly associated with abdominal pain. Psychological factors, food, and allergens were perceived as the main causes of GI discomfort by web users. CONCLUSIONS GI discomfort is actively discussed by web users. This study reveals a complex relationship between food, stress, and GI discomfort. Our approach has shown that identifying web-based discussion topics associated with GI discomfort and its perceived factors is feasible and can serve as a complementary source of real-world evidence for caregivers.
Affiliation(s)
- Florent Schäfer
  - Innovation Science and Nutrition, Danone Nutricia Research, Palaiseau, France
- Guy Fagherazzi
  - Deep Digital Phenotyping Research Unit, Department of Population Health, Luxembourg Institute of Health, Strassen, Luxembourg
  - Center of Research in Epidemiology and Population Health, UMR 1018 Inserm, Institut Gustave Roussy, Paris-Sud Paris-Saclay University, Villejuif, France
- Boris Le Nevé
  - Innovation Science and Nutrition, Danone Nutricia Research, Palaiseau, France
10
Cotté FE, Voillot P, Bennett B, Falissard B, Tzourio C, Foulquié P, Gaudin AF, Lemasson H, Grumberg V, McDonald L, Faviez C, Schück S. Exploring the Health-Related Quality of Life of Patients Treated With Immune Checkpoint Inhibitors: Social Media Study. J Med Internet Res 2020; 22:e19694. [PMID: 32915159 PMCID: PMC7519426 DOI: 10.2196/19694] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 04/28/2020] [Revised: 07/10/2020] [Accepted: 07/26/2020] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Immune checkpoint inhibitors (ICIs) are increasingly used to treat several types of tumors. Impact of this emerging therapy on patients' health-related quality of life (HRQoL) is usually collected in clinical trials through standard questionnaires. However, this might not fully reflect HRQoL of patients under real-world conditions. In parallel, users' narratives from social media represent a potential new source of research concerning HRQoL. OBJECTIVE The aim of this study is to assess and compare coverage of ICI-treated patients' HRQoL domains and subdomains in standard questionnaires from clinical trials and in real-world setting from social media posts. METHODS A retrospective study was carried out by collecting social media posts in French language written by internet users mentioning their experiences with ICIs between January 2011 and August 2018. Automatic and manual extractions were implemented to create a corpus where domains and subdomains of HRQoL were classified. These annotations were compared with domains covered by 2 standard HRQoL questionnaires, the EORTC QLQ-C30 and the FACT-G. RESULTS We identified 150 users who described their own experience with ICI (89/150, 59.3%) or that of their relative (61/150, 40.7%), with 137 users (91.3%) reporting at least one HRQoL domain in their social media posts. A total of 8 domains and 42 subdomains of HRQoL were identified: Global health (1 subdomain; 115 patients), Symptoms (13; 76), Emotional state (10; 49), Role (7; 22), Physical activity (4; 13), Professional situation (3; 9), Cognitive state (2; 2), and Social state (2; 2). The QLQ-C30 showed a wider global coverage of social media HRQoL subdomains than the FACT-G, 45% (19/42) and 29% (12/42), respectively. 
For both QLQ-C30 and FACT-G questionnaires, coverage rates were particularly suboptimal for Symptoms (68/123, 55.3% and 72/123, 58.5%, respectively), Emotional state (7/49, 14% and 24/49, 49%, respectively), and Role (17/22, 77% and 15/22, 68%, respectively). CONCLUSIONS Many patients with cancer are using social media to share their experiences with immunotherapy. Collecting and analyzing their spontaneous narratives are helpful to capture and understand their HRQoL in real-world setting. New measures of HRQoL are needed to provide more in-depth evaluation of Symptoms, Emotional state, and Role among patients with cancer treated with immunotherapy.
Affiliation(s)
- Bruno Falissard
  - Paris-Sud University, Paris, France
  - Paris-Descartes University, Paris, France
  - AP-HP, Paris, France
  - INSERM U1178, Paris, France
11
12
Gavrielov-Yusim N, Kürzinger ML, Nishikawa C, Pan C, Pouget J, Epstein LB, Golant Y, Tcherny-Lessenot S, Lin S, Hamelin B, Juhaeri J. Comparison of text processing methods in social media-based signal detection. Pharmacoepidemiol Drug Saf 2019; 28:1309-1317. [PMID: 31392844 DOI: 10.1002/pds.4857] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 10/18/2018] [Revised: 06/12/2019] [Accepted: 06/14/2019] [Indexed: 11/08/2022]
Abstract
PURPOSE Adverse event (AE) identification in social media (SM) can be performed using various types of natural language processing (NLP) and machine learning (ML). These methods can be categorized by complexity and precision level. Co-occurrence-based ML methods are rather basic, as they identify simultaneous appearance of drugs and clinical events in a single post. In contrast, statistical learning methods involve more complex NLP and identify drugs, events, and associations between them. We aimed to compare the ability of co-occurrence and NLP to identify AEs and signals of disproportionate reporting (SDR) in patient-generated SM. We also examined the performance of lift in SM-based signal detection (SD). METHODS Our examination was performed in a corpus of SM posts crawled from open online patient forums and communities, using the spontaneously reported VigiBase data as reference data set. RESULTS We found that co-occurrence and NLP produce AEs, which are 57% and 93% consistent with VigiBase AEs, respectively. Among the SDRs identified both in SM and in VigiBase, up to 55.3% were identified earlier in co-occurrence, and up to 32.1% were identified earlier in NLP-processed SM. Using lift in SM SD provided performance similar to frequentist methods, both in co-occurrence and in NLP-processed AEs. CONCLUSION Our results indicate that using SM as a data source complementary to traditional pharmacovigilance sources should be considered further. Various levels of SM processing may be considered, depending on the preferred policies and tolerance for false-positive to false-negative balance in routine pharmacovigilance processes.
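The abstract does not define lift. A minimal sketch follows, assuming the common association-rule definition: the observed share of posts mentioning both a drug and an event, divided by the share expected if the two were independent. The function name and all counts are hypothetical and are not taken from the study.

```python
def lift(n_drug_event: int, n_drug: int, n_event: int, n_total: int) -> float:
    """Observed drug-event co-mention rate over the rate expected under independence.

    n_drug_event: posts mentioning both the drug and the event
    n_drug:       posts mentioning the drug
    n_event:      posts mentioning the event
    n_total:      all posts in the corpus
    """
    if min(n_drug, n_event, n_total) <= 0:
        raise ValueError("drug, event, and total counts must be positive")
    observed = n_drug_event / n_total
    expected = (n_drug / n_total) * (n_event / n_total)
    return observed / expected


# Hypothetical corpus: 10,000 posts, 300 mention the drug, 500 mention the
# event, and 30 mention both. Independence would predict 15 co-mentions.
print(lift(30, 300, 500, 10_000))
```

Under this definition a value above 1 means the pair co-occurs more often than chance would predict; the cut-off at which that counts as a signal is a policy choice, in line with the abstract's remark about tolerance for false positives versus false negatives.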
Affiliation(s)
- Chihiro Nishikawa
- Epidemiology and Benefit Risk Evaluation, Sanofi, Chilly-Mazarin, France
- Chunshen Pan
- Epidemiology and Benefit Risk Evaluation, Sanofi, Bridgewater, NJ, USA
- Julie Pouget
- Information Technology and Solutions, R&D CMO - SC Real World Evidence, Sanofi, Lyon, France
- Stephen Lin
- Global Pharmacovigilance, Sanofi, Bridgewater, NJ, USA
- Juhaeri Juhaeri
- Epidemiology and Benefit Risk Evaluation, Sanofi, Bridgewater, NJ, USA
13
Bate A, Hornbuckle K, Juhaeri J, Motsko SP, Reynolds RF. Hypothesis-free signal detection in healthcare databases: finding its value for pharmacovigilance. Ther Adv Drug Saf 2019; 10:2042098619864744. [PMID: 31428307 PMCID: PMC6683315 DOI: 10.1177/2042098619864744] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 12/20/2022] Open
Affiliation(s)
- Andrew Bate
- Division of Translational Medicine, Department of Medicine, NYU School of Medicine, 462 1st Avenue, New York, NY 10016, USA
- Ken Hornbuckle
- Global Patient Safety, Eli Lilly and Company, Indianapolis, IN, USA
- Juhaeri Juhaeri
- Medical Evidence Generation, Sanofi US, Bridgewater, NJ, USA
- Robert F. Reynolds
- Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA
14
Caster O, Dietrich J, Kürzinger ML, Lerch M, Maskell S, Norén GN, Tcherny-Lessenot S, Vroman B, Wisniewski A, van Stekelenborg J. Assessment of the Utility of Social Media for Broad-Ranging Statistical Signal Detection in Pharmacovigilance: Results from the WEB-RADR Project. Drug Saf 2018; 41:1355-1369. [PMID: 30043385 PMCID: PMC6223695 DOI: 10.1007/s40264-018-0699-2] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Indexed: 12/28/2022]
Abstract
INTRODUCTION AND OBJECTIVE Social media has been proposed as a possibly useful data source for pharmacovigilance signal detection. This study primarily aimed to evaluate the performance of established statistical signal detection algorithms in Twitter/Facebook for a broad range of drugs and adverse events. METHODS Performance was assessed using a reference set by Harpaz et al., consisting of 62 US Food and Drug Administration labelling changes, and an internal WEB-RADR reference set consisting of 200 validated safety signals. In total, 75 drugs were studied. Twitter/Facebook posts were retrieved for the period March 2012 to March 2015, and drugs/events were extracted from the posts. We retrieved 4.3 million and 2.0 million posts for the WEB-RADR and Harpaz drugs, respectively. Individual case reports were extracted from VigiBase for the same period. Disproportionality algorithms based on the Information Component or the Proportional Reporting Ratio and crude post/report counting were applied in Twitter/Facebook and VigiBase. Receiver operating characteristic curves were generated, and the relative timing of alerting was analysed. RESULTS Across all algorithms, the area under the receiver operating characteristic curve for Twitter/Facebook varied between 0.47 and 0.53 for the WEB-RADR reference set and between 0.48 and 0.53 for the Harpaz reference set. For VigiBase, the ranges were 0.64-0.69 and 0.55-0.67, respectively. In Twitter/Facebook, at best, 31 (16%) and four (6%) positive controls were detected prior to their index dates in the WEB-RADR and Harpaz references, respectively. In VigiBase, the corresponding numbers were 66 (33%) and 17 (27%). CONCLUSIONS Our results clearly suggest that broad-ranging statistical signal detection in Twitter and Facebook, using currently available methods for adverse event recognition, performs poorly and cannot be recommended at the expense of other pharmacovigilance activities.
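The two disproportionality measures named in this abstract can be sketched from a 2x2 contingency table of report (or post) counts. The snippet below uses commonly cited textbook forms: the Proportional Reporting Ratio as a ratio of two proportions, and the Information Component as a shrunk log2 observed-to-expected ratio. The 0.5 shrinkage constant and the function names are assumptions for illustration; the abstract does not state which exact variants or thresholds the study applied.

```python
import math


def prr(a, b, c, d):
    """Proportional Reporting Ratio from a 2x2 table.

    a: reports with the drug and the event
    b: reports with the drug, without the event
    c: reports without the drug, with the event
    d: reports with neither
    Raises ZeroDivisionError if a + b == 0 or c == 0.
    """
    return (a / (a + b)) / (c / (c + d))


def information_component(a, b, c, d, shrink=0.5):
    """Shrunk log2 observed-to-expected ratio (assumed common formulation)."""
    n = a + b + c + d
    expected = (a + b) * (a + c) / n  # count expected if drug and event were independent
    return math.log2((a + shrink) / (expected + shrink))


# Illustrative counts only, not data from the cited study.
example_prr = prr(20, 80, 100, 9800)
example_ic = information_component(20, 80, 100, 9800)
```

Both measures compare how often an event is reported with a drug against how often it would be expected from the background; values well above 1 for PRR, or above 0 for IC, point to disproportionate reporting.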
Affiliation(s)
- Ola Caster
- Uppsala Monitoring Centre, Box 1051, Uppsala, 75140, Sweden
- G Niklas Norén
- Uppsala Monitoring Centre, Box 1051, Uppsala, 75140, Sweden