1
Heinz MV, Yom-Tov E, Mackin DM, Matsumura R, Jacobson NC. A large-scale observational comparison of antidepressants and their effects. J Psychiatr Res 2024; 178:219-224. [PMID: 39163659 PMCID: PMC11398883 DOI: 10.1016/j.jpsychires.2024.08.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2024] [Revised: 08/01/2024] [Accepted: 08/02/2024] [Indexed: 08/22/2024]
Abstract
BACKGROUND Selective serotonin reuptake inhibitors (SSRIs) are a diverse class of medications widely prescribed for depression and anxiety. Despite their common use, large-scale, real-world evidence capturing the heterogeneity of their effects on individuals is lacking. This study addresses this gap by using naturalistic search data to explore the varied impact of six SSRIs on user behavior. METHODS The study sample included ∼508 thousand Bing users who searched for one of six SSRIs (citalopram, escitalopram, fluoxetine, fluvoxamine, paroxetine, sertraline) from April to December 2022, comprising 510 million queries. Cox proportional hazard models were used to examine 30 topics (e.g., shopping, tourism, health) and 195 health symptoms (e.g., anxiety, weight gain, impotence), using each SSRI as a reference. We assessed the relative hazard ratios between drugs and, where feasible, ranked the SSRIs by their observed effects. Cox models were chosen because they account for both the likelihood of a user searching for a particular topic or symptom and the time to that search. The temporal aspect helped distinguish between potential symptoms of the disorder, short-term medication side effects, and later-appearing side effects. RESULTS Search behaviors differed by SSRI. For example, fluvoxamine was associated with a significantly higher likelihood of searching for weight gain than all other SSRIs (HRs 1.85-2.93). Searches following citalopram were associated with significantly higher rates of later impotence queries than all other SSRIs (HRs 5.11-7.76) except fluvoxamine. Fluvoxamine was associated with a significantly higher rate of health-related searches than all other SSRIs (HRs 2.11-2.36). CONCLUSIONS Our study reveals new insights into the varying impacts of SSRIs, suggesting distinct symptom profiles. This novel use of large-scale, naturalistic search data contributes to pharmacovigilance efforts, enhancing our understanding of intra-class variation among SSRIs and potentially uncovering previously unidentified drug effects.
Affiliation(s)
- Michael V Heinz
- Center for Technology and Behavioral Health, Geisel School of Medicine, Dartmouth College, Lebanon, NH, United States; Department of Psychiatry, Geisel School of Medicine, Dartmouth College, Hanover, NH, United States; Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Lebanon, NH, United States
- Elad Yom-Tov
- Microsoft Research Israel, Herzeliya, Israel; Faculty of Industrial Engineering and Management, Technion - Israel Institute of Technology, Haifa, Israel
- Daniel M Mackin
- Center for Technology and Behavioral Health, Geisel School of Medicine, Dartmouth College, Lebanon, NH, United States; Department of Psychiatry, Geisel School of Medicine, Dartmouth College, Hanover, NH, United States; Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Lebanon, NH, United States
- Rina Matsumura
- Center for Technology and Behavioral Health, Geisel School of Medicine, Dartmouth College, Lebanon, NH, United States
- Nicholas C Jacobson
- Center for Technology and Behavioral Health, Geisel School of Medicine, Dartmouth College, Lebanon, NH, United States; Quantitative Biomedical Sciences Program, Dartmouth College, Hanover, NH, United States; Department of Psychiatry, Geisel School of Medicine, Dartmouth College, Hanover, NH, United States; Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Lebanon, NH, United States
2
Dhiman A, Yom-Tov E, Pellis L, Edelstein M, Pebody R, Hayward A, House T, Finnie T, Guzman D, Lampos V, Cox IJ. Estimating the household secondary attack rate and serial interval of COVID-19 using social media. NPJ Digit Med 2024; 7:194. [PMID: 39033238 PMCID: PMC11271293 DOI: 10.1038/s41746-024-01160-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Accepted: 06/10/2024] [Indexed: 07/23/2024]
Abstract
We propose a method to estimate the household secondary attack rate (hSAR) of COVID-19 in the United Kingdom based on activity on the social media platform X, formerly known as Twitter. Conventional methods of hSAR estimation are resource intensive, requiring regular contact tracing of COVID-19 cases. Our proposed framework provides a complementary method that does not rely on conventional contact tracing or laboratory involvement, including the collection, processing, and analysis of biological samples. We use a text classifier to identify reports of people tweeting about themselves and/or members of their household having COVID-19 infections. A probabilistic analysis is then performed to estimate the hSAR based on the number of self or household, and self and household tweets of COVID-19 infection. The analysis includes adjustments for a reluctance of Twitter users to tweet about household members, and the possibility that the secondary infection was not acquired within the household. Experimental results for the UK, both monthly and weekly, are reported for the period from January 2020 to February 2022. Our results agree with previously reported hSAR estimates, varying with the primary variants of concern, e.g. delta and omicron. The serial interval (SI) is based on the time between the two tweets that indicate a primary and secondary infection. Experimental results, though larger than the consensus, are qualitatively similar. The estimation of hSAR and SI using social media data constitutes a new tool that may help in characterizing, forecasting and managing outbreaks and pandemics in a faster, affordable, and more efficient manner.
Affiliation(s)
- Aarzoo Dhiman
- Department of Computer Science, University College London, London, UK
- Centre of Excellence for Data Science, AI and Modelling, University of Hull, Hull, UK
- Elad Yom-Tov
- Microsoft Research, Herzliya, Israel
- Department of Computer Science, Bar Ilan University, Ramat Gan, Israel
- Lorenzo Pellis
- Department of Mathematics, University of Manchester, Manchester, UK
- Richard Pebody
- UK Health Security Agency, 61 Colindale Avenue, NW9 5EQ, London, UK
- Andrew Hayward
- UCL Collaborative Centre for Inclusion Health, UCL, London, UK
- Thomas House
- Department of Mathematics, University of Manchester, Manchester, UK
- Thomas Finnie
- UK Health Security Agency, 61 Colindale Avenue, NW9 5EQ, London, UK
- David Guzman
- Department of Computer Science, University College London, London, UK
- Vasileios Lampos
- Department of Computer Science, University College London, London, UK
- Ingemar J Cox
- Department of Computer Science, University College London, London, UK
- Department of Computer Science, University of Copenhagen, Copenhagen, Denmark
3
Keller R, Spanu A, Puhan MA, Flahault A, Lovis C, Mütsch M, Beau-Lejdstrom R. Social media and internet search data to inform drug utilization: A systematic scoping review. Front Digit Health 2023; 5:1074961. [PMID: 37021064 PMCID: PMC10067924 DOI: 10.3389/fdgth.2023.1074961] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2022] [Accepted: 02/27/2023] [Indexed: 04/07/2023]
Abstract
Introduction Drug utilization is currently assessed through traditional data sources such as big electronic medical records (EMRs) databases, surveys, and medication sales. Social media and internet data have been reported to provide more accessible and more timely access to medications' utilization. Objective This review aims at providing evidence comparing web data on drug utilization to other sources before the COVID-19 pandemic. Methods We searched Medline, EMBASE, Web of Science, and Scopus until November 25th, 2019, using a predefined search strategy. Two independent reviewers conducted screening and data extraction. Results Of 6,563 (64%) deduplicated publications retrieved, 14 (0.2%) were included. All studies showed positive associations between drug utilization information from web and comparison data using very different methods. A total of nine (64%) studies found positive linear correlations in drug utilization between web and comparison data. Five studies reported association using other methods: One study reported similar drug popularity rankings using both data sources. Two studies developed prediction models for future drug consumption, including both web and comparison data, and two studies conducted ecological analyses but did not quantitatively compare data sources. According to the STROBE, RECORD, and RECORD-PE checklists, overall reporting quality was mediocre. Many items were left blank as they were out of scope for the type of study investigated. Conclusion Our results demonstrate the potential of web data for assessing drug utilization, although the field is still in a nascent period of investigation. Ultimately, social media and internet search data could be used to get a quick preliminary quantification of drug use in real time. Additional studies on the topic should use more standardized methodologies on different sets of drugs in order to confirm these findings. 
In addition, currently available checklists for study quality of reporting would need to be adapted to these new sources of scientific information.
Affiliation(s)
- Roman Keller
- Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Zurich, Switzerland
- Future Health Technologies, Singapore-ETH Centre, Campus for Research Excellence and Technological Enterprise (CREATE), Singapore, Singapore
- Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Singapore
- Correspondence: Roman Keller
- Alessandra Spanu
- Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Zurich, Switzerland
- Milo Alan Puhan
- Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Zurich, Switzerland
- Antoine Flahault
- Institute of Global Health, University of Geneva, Geneva, Switzerland
- Christian Lovis
- Division of Medical Information Sciences, University Hospitals of Geneva, Geneva, Switzerland
- Department of Radiology and Medical Informatics, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Margot Mütsch
- Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Zurich, Switzerland
- Raphaelle Beau-Lejdstrom
- Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Zurich, Switzerland
- Institute of Global Health, University of Geneva, Geneva, Switzerland
4
Déguilhem A, Malaab J, Talmatkadi M, Renner S, Foulquié P, Fagherazzi G, Loussikian P, Marty T, Mebarki A, Texier N, Schuck S. Identifying Profiles and Symptoms of Patients With Long COVID in France: Data Mining Infodemiology Study Based on Social Media. JMIR INFODEMIOLOGY 2022; 2:e39849. [PMID: 36447795 PMCID: PMC9685517 DOI: 10.2196/39849] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/25/2022] [Revised: 09/19/2022] [Accepted: 10/01/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND Long COVID, a condition with persistent symptoms post COVID-19 infection, is the first illness arising from social media. In France, the French hashtag #ApresJ20 described symptoms persisting longer than 20 days after contracting COVID-19. Faced with a lack of recognition from medical and official entities, patients formed communities on social media and described their symptoms as long-lasting, fluctuating, and multisystemic. While many studies on long COVID relied on traditional research methods with lengthy processes, social media offers a foundation for large-scale studies with a fast-flowing outburst of data. OBJECTIVE We aimed to identify and analyze Long Haulers' main reported symptoms, symptom co-occurrences, topics of discussion, difficulties encountered, and patient profiles. METHODS Data were extracted based on a list of pertinent keywords from public sites (eg, Twitter) and health-related forums (eg, Doctissimo). Reported symptoms were identified via the MedDRA dictionary, displayed per the volume of posts mentioning them, and aggregated at the user level. Associations were assessed by computing co-occurrences in users' messages, as pairs of preferred terms. Discussion topics were analyzed using Biterm Topic Modeling; difficulties and unmet needs were explored manually. To identify patient profiles in relation to their symptoms, each preferred term's total was used to create user-level hierarchical clusters. RESULTS Between January 1, 2020, and August 10, 2021, overall, 15,364 messages were identified as originating from 6494 patients with long COVID or their caregivers. Our analyses revealed 3 major symptom co-occurrences: asthenia-dyspnea (102/289, 35.3%), asthenia-anxiety (65/289, 22.5%), and asthenia-headaches (50/289, 17.3%). The main reported difficulties were symptom management (150/424, 35.4% of messages), psychological impact (64/424, 15.1%), significant pain (51/424, 12.0%), deterioration in general well-being (52/424, 12.3%), and impact on daily and professional life (40/424, 9.4% and 34/424, 8.0% of messages, respectively). We identified 3 profiles of patients in relation to their symptoms: profile A (n=406 patients) reported exclusively an asthenia symptom; profile B (n=129) expressed anxiety (n=129, 100%), asthenia (n=28, 21.7%), dyspnea (n=15, 11.6%), and ageusia (n=3, 2.3%); and profile C (n=141) described dyspnea (n=141, 100%) and asthenia (n=45, 31.9%). Approximately 49.1% of users (79/161) continued expressing symptoms more than 3 months post infection, and 20.5% (33/161) after 1 year. CONCLUSIONS Long COVID is a lingering condition that affects people worldwide, physically and psychologically. It impacts Long Haulers' quality of life, everyday tasks, and professional activities. Social media played an undeniable role in raising and delivering Long Haulers' voices and can potentially rapidly provide large volumes of valuable patient-reported information. Since long COVID was a self-titled condition by patients themselves via social media, it is imperative to continuously include their perspectives in related research. Our results can help design patient-centric instruments to be further used in clinical practice to better capture meaningful dimensions of long COVID.
Affiliation(s)
- Guy Fagherazzi
- Deep Digital Phenotyping Research Unit, Department of Precision Health, Luxembourg Institute of Health, Strassen, Luxembourg
5
Abroms LC, Yom-Tov E. The Role of Information Boxes in Search Engine Results for Symptom Searches: Analysis of Archival Data. JMIR INFODEMIOLOGY 2022; 2:e37286. [PMID: 37113445 PMCID: PMC9987180 DOI: 10.2196/37286] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2022] [Revised: 07/12/2022] [Accepted: 08/20/2022] [Indexed: 04/29/2023]
Abstract
Background Search engines provide health information boxes as part of search results to address information gaps and misinformation for commonly searched symptoms. Few prior studies have sought to understand how individuals who are seeking information about health symptoms navigate different types of page elements on search engine results pages, including health information boxes. Objective Using real-world search engine data, this study sought to investigate how users searching for common health-related symptoms with Bing interacted with health information boxes (info boxes) and other page elements. Methods A sample of searches (N=28,552 unique searches) was compiled for the 17 most common medical symptoms queried on Microsoft Bing by users in the United States between September and November 2019. The association between the page elements that users saw, their characteristics, and the time spent on elements or clicks was investigated using linear and logistic regression. Results The number of searches ranged by symptom type from 55 searches for cramps to 7459 searches for anxiety. Users searching for common health-related symptoms saw pages with standard web results (n=24,034, 84%), itemized web results (n=23,354, 82%), ads (n=13,171, 46%), and info boxes (n=18,215, 64%). Users spent on average 22 (SD 26) seconds on the search engine results page. Users who saw all page elements spent 25% (7.1 s) of their time on the info box, 23% (6.1 s) on standard web results, 20% (5.7 s) on ads, and 10% (10 s) on itemized web results, with significantly more time on the info box compared to other elements and the least amount of time on itemized web results. Info box characteristics such as reading ease and appearance of related conditions were associated with longer time on the info box. 
Although none of the info box characteristics were associated with clicks on standard web results, info box characteristics such as reading ease and related searches were negatively correlated with clicks on ads. Conclusions Info boxes were attended most by users compared with other page elements, and their characteristics may influence future web searching. Future studies are needed that further explore the utility of info boxes and their influence on real-world health-seeking behaviors.
Affiliation(s)
- Lorien C Abroms
- Milken Institute School of Public Health, George Washington University, Washington, DC, United States
6
Shiju A, He Z. Classifying Drug Ratings Using User Reviews with Transformer-Based Language Models. IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS 2022; 2022:163-169. [PMID: 36518748 PMCID: PMC9744636 DOI: 10.1109/ichi54592.2022.00035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Abstract
Drug review websites such as Drugs.com provide users' textual reviews and numeric ratings of drugs. Consumers use these reviews, along with the ratings, to choose a drug. However, the numeric ratings may not always be consistent with the text reviews, so relying purely on the rating score to find positive or negative reviews may not be reliable. Automatic classification of user ratings based on the textual review can create a more reliable rating for drugs. In this project, we built models to classify drug review ratings from textual reviews using traditional machine learning and deep learning. Traditional machine learning models, including Random Forest and Naive Bayes classifiers, were built using TF-IDF features as input. Transformer-based neural network models, including BERT, Bio_ClinicalBERT, RoBERTa, XLNet, ELECTRA, and ALBERT, were built using the raw text as input. The Bio_ClinicalBERT model outperformed the other models with an overall accuracy of 87%. We further identified concepts of the Unified Medical Language System (UMLS) from the postings and analyzed their semantic types stratified by class type. This research demonstrated that transformer-based models can be used to classify drug reviews based solely on textual reviews.
Affiliation(s)
- Akhil Shiju
- Department of Biological Sciences, Florida State University, Tallahassee, Florida, USA
- Zhe He
- School of Information, Florida State University, Tallahassee, Florida, USA
7
Yom-Tov E, Lampos V, Inns T, Cox IJ, Edelstein M. Providing early indication of regional anomalies in COVID-19 case counts in England using search engine queries. Sci Rep 2022; 12:2373. [PMID: 35149764 PMCID: PMC8837788 DOI: 10.1038/s41598-022-06340-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/05/2021] [Accepted: 01/28/2022] [Indexed: 11/09/2022]
Abstract
Prior work has shown the utility of using Internet searches to track the incidence of different respiratory illnesses. Similarly, people who suffer from COVID-19 may query for their symptoms prior to accessing the medical system (or in lieu of it). To assist in the UK government's response to the COVID-19 pandemic we analyzed searches for relevant symptoms on the Bing web search engine from users in England to identify areas of the country where unexpected rises in relevant symptom searches occurred. These were reported weekly to the UK Health Security Agency to assist in their monitoring of the pandemic. Our analysis shows that searches for "fever" and "cough" were the most correlated with future case counts during the initial stages of the pandemic, with searches preceding case counts by up to 21 days. Unexpected rises in search patterns were predictive of anomalous rises in future case counts within a week, reaching an Area Under Curve of 0.82 during the initial phase of the pandemic, and later reducing due to changes in symptom presentation. Thus, analysis of regional searches for symptoms can provide an early indicator (of more than one week) of increases in COVID-19 case counts.
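The idea of searches "preceding case counts" can be illustrated with a lagged correlation. The following is only a minimal sketch, not the authors' pipeline: the function names, the toy series, and the use of a plain Pearson correlation are all assumptions made for illustration, and the time step is arbitrary rather than the days used in the study.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def lagged_correlation(searches, cases, lag):
    """Correlate searches at step t with case counts at step t + lag."""
    n = len(searches) - lag
    return pearson(searches[:n], cases[lag:])

def best_lag(searches, cases, max_lag):
    """Lag (0..max_lag) at which searches correlate most with later cases."""
    return max(range(max_lag + 1),
               key=lambda lag: lagged_correlation(searches, cases, lag))

# Hypothetical toy data: case counts echo search volume three steps later.
searches = [1, 3, 2, 5, 8, 6, 9, 4, 2, 1, 3, 2]
cases = [0, 0, 0, 10, 30, 20, 50, 80, 60, 90, 40, 20]

print(best_lag(searches, cases, max_lag=5))
```

In this toy series the case counts are an exact scaled copy of the searches shifted by three steps, so the correlation peaks at that lag; real surveillance data would be far noisier.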
Affiliation(s)
- Elad Yom-Tov
- Microsoft Research, Herzliya, Israel
- Faculty of Industrial Engineering and Management, Technion, Haifa, Israel
- Vasileios Lampos
- Department of Computer Science, University College London, London, UK
- Thomas Inns
- UK Health Security Agency, London, UK
- St Helens and Knowsley Teaching Hospitals NHS Trust, Merseyside, UK
- Ingemar J Cox
- Department of Computer Science, University College London, London, UK
- Department of Computer Science, University of Copenhagen, Copenhagen, Denmark
8
Jacobson NC, Yom-Tov E, Lekkas D, Heinz M, Liu L, Barr PJ. Impact of online mental health screening tools on help-seeking, care receipt, and suicidal ideation and suicidal intent: Evidence from internet search behavior in a large U.S. cohort. J Psychiatr Res 2022; 145:276-283. [PMID: 33199054 PMCID: PMC8106691 DOI: 10.1016/j.jpsychires.2020.11.010] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/23/2020] [Revised: 10/30/2020] [Accepted: 11/03/2020] [Indexed: 01/03/2023]
Abstract
INTRODUCTION Most people with psychiatric illnesses do not receive treatment for almost a decade after disorder onset. Online mental health screens reflect one mechanism designed to shorten this lag in help-seeking, yet there has been limited research on the effectiveness of screening tools in naturalistic settings. MATERIAL AND METHODS We examined a cohort of persons directed to a mental health screening tool via the Bing search engine (n = 126,060). We evaluated the impact of tool content on later searches for mental health self-references, self-diagnosis, care seeking, psychoactive medications, suicidal ideation, and suicidal intent. Website characteristics were evaluated by pairs of independent raters to ascertain screen type and content. These included the presence/absence of a suggestive diagnosis, a message on interpretability, as well as referrals to digital treatments, in-person treatments, and crisis services. RESULTS Using machine learning models, the results suggested that screen content predicted later searches with mental health self-references (AUC = 0·73), mental health self-diagnosis (AUC = 0·69), mental health care seeking (AUC = 0·61), psychoactive medications (AUC = 0·55), suicidal ideation (AUC = 0·58), and suicidal intent (AUC = 0·60). Cox-proportional hazards models suggested individuals utilizing tools with in-person care referral were significantly more likely to subsequently search for methods to actively end their life (HR = 1·727, p = 0·007). DISCUSSION Online screens may influence help-seeking behavior, suicidal ideation, and suicidal intent. Websites with referrals to in-person treatments could put persons at greater risk of active suicidal intent. Further evaluation using large-scale randomized controlled trials is needed.
Affiliation(s)
- Nicholas C. Jacobson
- Center for Technology and Behavioral Health, Geisel School of Medicine, Dartmouth College, 46 Centerra Parkway, EverGreen Center, Suite 315, Lebanon, NH, 03756, United States; Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Williamson Building, 3rd Floor, 1 Medical Center Drive, Lebanon, NH 03756, United States; Department of Psychiatry, Geisel School of Medicine, Dartmouth College, 1 Medical Center Drive, Lebanon, NH 03756, United States; Quantitative Biomedical Sciences Program, Dartmouth College, Lebanon, NH
- Correspondence: Nicholas C. Jacobson, Center for Technology and Behavioral Health, Geisel School of Medicine, Dartmouth College, 46 Centerra Parkway, Suite 300, Office # 333S, Lebanon, NH 03766; phone: (603) 646-7037
- Elad Yom-Tov
- Microsoft Research, 13 Shenkar Street, Herzeliya, 4672513, Israel; Faculty of Industrial Engineering and Management, Technion - Israel Institute of Technology, Haifa, 3200003, Israel
- Damien Lekkas
- Center for Technology and Behavioral Health, Geisel School of Medicine, Dartmouth College, 46 Centerra Parkway, EverGreen Center, Suite 315, Lebanon, NH, 03756, United States; Quantitative Biomedical Sciences Program, Dartmouth College, NH, United States
- Michael Heinz
- Center for Technology and Behavioral Health, Geisel School of Medicine, Dartmouth College, 46 Centerra Parkway, EverGreen Center, Suite 315, Lebanon, NH, 03756, United States; Dartmouth-Hitchcock Medical Center, 1 Medical Center Drive, Lebanon, NH, 03756, United States
- Lili Liu
- Quantitative Biomedical Sciences Program, Dartmouth College, NH, United States
- Paul J. Barr
- Center for Technology and Behavioral Health, Geisel School of Medicine, Dartmouth College, 46 Centerra Parkway, EverGreen Center, Suite 315, Lebanon, NH, 03756, United States; The Dartmouth Institute, Geisel School of Medicine, Dartmouth College, Williamson Building, 5th Floor, 1 Medical Center Drive, Lebanon, NH 03756, United States
9
Kamba M, Manabe M, Wakamiya S, Yada S, Aramaki E, Odani S, Miyashiro I. Medical Needs Extraction for Breast Cancer Patients from Question and Answer Services: Natural Language Processing-Based Approach. JMIR Cancer 2021; 7:e32005. [PMID: 34709187 PMCID: PMC8587180 DOI: 10.2196/32005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/14/2021] [Revised: 09/25/2021] [Accepted: 10/04/2021] [Indexed: 11/13/2022]
Abstract
BACKGROUND A large number of patient narratives are available on various web services. As for web question and answer services, patient questions often relate to medical needs, and we expect these questions to provide clues for a better understanding of patients' medical needs. OBJECTIVE This study aimed to extract patients' needs and classify them into thematic categories. Clarifying patient needs is the first step in solving social issues that patients with cancer encounter. METHODS For this study, we used patient question texts containing the key phrase "breast cancer," available at the Yahoo! Japan question and answer service, Yahoo! Chiebukuro, which contains over 60,000 questions on cancer. First, we converted the question text into a vector representation. Next, the relevance between patient needs and existing cancer needs categories was calculated based on cosine similarity. RESULTS The proportion of correct classifications in our proposed method was approximately 70%. Considering the results of classifying questions, we found the variation and the number of needs. CONCLUSIONS We created 3 corpora to classify the problems of patients with cancer. The proposed method was able to classify the problems considering the question text. Moreover, as an application example, the question text that included the side effect signaling of drugs and the unmet needs of cancer patients could be extracted. Revealing these needs is important to fulfill the medical needs of patients with cancer.
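The two method steps named in this abstract (turn the question text into a vector, then score it against needs categories by cosine similarity) can be sketched as follows. This is a hedged illustration only: the abstract does not say which vector representation was used, so a simple bag-of-words count stands in for it, and the category names and keyword descriptions are hypothetical.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term counts (an assumed stand-in for the study's vectors)."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)

# Hypothetical needs categories, each described by a few keywords.
CATEGORIES = {
    "treatment": "surgery chemotherapy radiation treatment side effects drug",
    "financial": "cost insurance payment money expenses work",
    "emotional": "anxiety fear worry family support feelings",
}

def classify(question):
    """Return the best-matching category and all similarity scores."""
    q = vectorize(question)
    scores = {name: cosine_similarity(q, vectorize(desc))
              for name, desc in CATEGORIES.items()}
    return max(scores, key=scores.get), scores

label, scores = classify(
    "I worry about the cost of chemotherapy and insurance payment")
print(label)
```

The example question shares three terms with the hypothetical "financial" description and only one with each of the others, so that category scores highest. A dense embedding would replace `vectorize` without changing the similarity step.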
Affiliation(s)
- Masaru Kamba
- Division of Information Science, Graduate School of Science and Technology, Nara Institute of Science and Technology, Nara, Japan
- Masae Manabe
- Division of Information Science, Graduate School of Science and Technology, Nara Institute of Science and Technology, Nara, Japan
- Shoko Wakamiya
- Division of Information Science, Graduate School of Science and Technology, Nara Institute of Science and Technology, Nara, Japan
- Shuntaro Yada
- Division of Information Science, Graduate School of Science and Technology, Nara Institute of Science and Technology, Nara, Japan
- Eiji Aramaki
- Division of Information Science, Graduate School of Science and Technology, Nara Institute of Science and Technology, Nara, Japan
- Satomi Odani
- Cancer Control Center, Osaka International Cancer Institute, Osaka, Japan
- Isao Miyashiro
- Cancer Control Center, Osaka International Cancer Institute, Osaka, Japan
10
Drug Recalls and Significance for Safe Clinical Nurse Specialist Prescribing. CLIN NURSE SPEC 2021; 35:288-290. [PMID: 34606207 DOI: 10.1097/nur.0000000000000628] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
11
Effenberger M, Kronbichler A, Bettac E, Grabherr F, Grander C, Adolph TE, Mayer G, Zoller H, Perco P, Tilg H. Using Infodemiology Metrics to Assess Public Interest in Liver Transplantation: Google Trends Analysis. J Med Internet Res 2021; 23:e21656. [PMID: 34402801 PMCID: PMC8408753 DOI: 10.2196/21656] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/20/2020] [Revised: 10/23/2020] [Accepted: 06/21/2021] [Indexed: 12/23/2022]
Abstract
Background Liver transplantation (LT) is the only curative treatment for end-stage liver disease. Less than 10% of global transplantation needs are met worldwide, and the need for LT is still increasing. The death rates on the waiting list remain too high. Objective It is, therefore, critical to raise awareness among the public and health care providers and in turn increasingly acquire donors. Methods We performed a Google Trends search using the search terms liver transplantation and liver transplant on October 15, 2020. On the basis of the resulting monthly data, the annual average Google Trends indices were calculated for the years 2004 to 2018. We not only investigated the trend worldwide but also used data from the United Network for Organ Sharing (UNOS), Spain, and Eurotransplant. Using pairwise Spearman correlations, Google Trends indices were examined over time and compared with the total number of liver transplants retrieved from the respective official websites of UNOS, the Organización Nacional de Trasplantes, and Eurotransplant. Results From 2004 to 2018, there was a significant decrease in the worldwide Google Trends index from 78.2 in 2004 to 20.5 in 2018 (–71.2%). This trend was more evident in UNOS than in the Eurotransplant group. In the same period, the number of transplanted livers increased worldwide. The waiting list mortality rate was 31% for Eurotransplant and 29% for UNOS. However, in Spain, where there are excellent awareness programs, the Google Trends index remained stable over the years with comparable, increasing LT numbers but a significantly lower waiting list mortality (15%). Conclusions Public awareness in LT has decreased significantly over the past two decades. Therefore, novel awareness programs should be initialized.
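The Methods here describe two computations: averaging monthly Google Trends indices into annual values, and comparing those with annual transplant counts using Spearman correlation. A minimal sketch of both follows. The helper names and all numbers are hypothetical and are not taken from the study; Spearman correlation is computed as the Pearson correlation of average ranks.

```python
import math
from collections import defaultdict

def annual_means(monthly):
    """Average monthly index values per year; keys are (year, month) tuples."""
    by_year = defaultdict(list)
    for (year, _month), value in monthly.items():
        by_year[year].append(value)
    return {year: sum(vals) / len(vals) for year, vals in sorted(by_year.items())}

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical annual series: search interest falls while transplants rise.
trend_index = [5, 4, 3, 2, 1]
transplants = [10, 20, 30, 40, 50]
print(spearman(trend_index, transplants))
```

With one series strictly decreasing and the other strictly increasing, the rank correlation is at its minimum of about -1, the same qualitative pattern the abstract reports worldwide.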
Affiliation(s)
- Maria Effenberger
  - Department of Internal Medicine I, Gastroenterology, Hepatology, Endocrinology and Metabolism, Medical University of Innsbruck, Innsbruck, Austria
- Andreas Kronbichler
  - Department of Internal Medicine IV, Nephrology and Hypertensiology, Medical University of Innsbruck, Innsbruck, Austria
- Erica Bettac
  - Department of Psychology, Washington State University Vancouver, Vancouver, WA, United States
- Felix Grabherr
  - Department of Internal Medicine I, Gastroenterology, Hepatology, Endocrinology and Metabolism, Medical University of Innsbruck, Innsbruck, Austria
- Christoph Grander
  - Department of Internal Medicine I, Gastroenterology, Hepatology, Endocrinology and Metabolism, Medical University of Innsbruck, Innsbruck, Austria
- Timon Erik Adolph
  - Department of Internal Medicine I, Gastroenterology, Hepatology, Endocrinology and Metabolism, Medical University of Innsbruck, Innsbruck, Austria
- Gert Mayer
  - Department of Internal Medicine IV, Nephrology and Hypertensiology, Medical University of Innsbruck, Innsbruck, Austria
- Heinz Zoller
  - Department of Internal Medicine I, Gastroenterology, Hepatology, Endocrinology and Metabolism, Medical University of Innsbruck, Innsbruck, Austria
- Paul Perco
  - Department of Internal Medicine IV, Nephrology and Hypertensiology, Medical University of Innsbruck, Innsbruck, Austria
- Herbert Tilg
  - Department of Internal Medicine I, Gastroenterology, Hepatology, Endocrinology and Metabolism, Medical University of Innsbruck, Innsbruck, Austria
|
12
|
Hswen Y, Yom-Tov E. Analysis of a Vaping-Associated Lung Injury Outbreak through Participatory Surveillance and Archival Internet Data. Int J Environ Res Public Health 2021; 18:ijerph18158203. [PMID: 34360495 PMCID: PMC8346109 DOI: 10.3390/ijerph18158203] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2021] [Revised: 07/28/2021] [Accepted: 07/30/2021] [Indexed: 11/22/2022]
Abstract
In September 2019, the US Centers for Disease Control and Prevention (CDC) issued an alert about a suspected outbreak of lung illness associated with the use of e-cigarette products. At the time the CDC published its alert, little was known about the causes of the outbreak or who was at risk. Here we provide insights into the outbreak through analysis of passive reporting and participatory surveillance. We collected data about vaping habits and associated adverse reactions from four data sources pertaining to people in the USA: a participatory surveillance platform (YouVape), Reddit, Google Trends, and Bing. Data were analyzed to identify vaping behaviors and reported adverse events. These were correlated among sources and with prior reports. Data were obtained from 720 YouVape users, 4331 Reddit users, and over 1 million Bing users. Large geographic variation was observed across vaping products. Significant correlation was found among the data sources in reported adverse reactions. Models of participatory surveillance data found specific product and adverse reaction associations. Specifically, cannabidiol was found to be associated with fever, while tetrahydrocannabinol was found to be correlated with diarrhea. Our results demonstrate that using different, complementary online data sources provides a holistic view of vaping-associated lung injury while augmenting traditional data sources.
Affiliation(s)
- Yulin Hswen
  - Department of Epidemiology and Biostatistics, University of California at San Francisco, San Francisco, CA 94158, USA
  - Bakar Computational Health Sciences Institute, University of California at San Francisco, San Francisco, CA 94143, USA
  - Innovation Program, Boston Children’s Hospital, Harvard Medical School, Boston, MA 02115, USA
- Elad Yom-Tov
  - Microsoft Research Israel, 3 Alan Turing Str., Herzeliya 4672415, Israel
  - Faculty of Industrial Engineering and Management, Technion, Haifa 3200000, Israel
|
13
|
Chen R, Zhang Y, Dou Z, Chen F, Xie K, Wang S. Data Sharing and Privacy in Pharmaceutical Studies. Curr Pharm Des 2021; 27:911-918. [PMID: 33438533 DOI: 10.2174/1381612827999210112204732] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/01/2020] [Accepted: 09/30/2020] [Indexed: 11/22/2022]
Abstract
Adverse drug events have been a long-standing concern because of their wide-ranging harms to public health and the substantial disease burden they impose. The key to diminishing or eliminating these impacts is to build a comprehensive pharmacovigilance system. The "big data" approach has been shown to assist the detection of adverse drug events by involving previously unavailable data sources and promoting health information exchange. Even so, challenges and potential risks remain. The lack of effective privacy-preserving measures in the flow of medical data is the most important one, and urgent action is required to prevent the threats and facilitate the construction of pharmacovigilance systems. Several privacy protection methods are reviewed in this article, which may help to break this barrier.
Affiliation(s)
- Rufan Chen
  - Department of Bioinformatics, Hangzhou Nuowei Information Technology Co., Ltd, Hangzhou, China
- Yi Zhang
  - Department of Cardiology, Xinhua Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, China
- Zuochao Dou
  - Department of Bioinformatics, Hangzhou Nuowei Information Technology Co., Ltd, Hangzhou, China
- Feng Chen
  - Department of Bioinformatics, Hangzhou Nuowei Information Technology Co., Ltd, Hangzhou, China
- Kang Xie
  - Key Lab of Information Network Security of Ministry of Public Security, the Third Research Institute of Ministry of Public Security, Shanghai, China
- Shuang Wang
  - Department of Bioinformatics, Hangzhou Nuowei Information Technology Co., Ltd, Hangzhou, China
|
14
|
Ayers JW, Althouse BM, Poliak A, Leas EC, Nobles AL, Dredze M, Smith D. Quantifying Public Interest in Police Reforms by Mining Internet Search Data Following George Floyd's Death. J Med Internet Res 2020; 22:e22574. [PMID: 33084578 PMCID: PMC7641778 DOI: 10.2196/22574] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/16/2020] [Revised: 08/21/2020] [Accepted: 09/21/2020] [Indexed: 01/21/2023]
Abstract
BACKGROUND The death of George Floyd while in police custody has resurfaced serious questions about police conduct that result in the deaths of unarmed persons. OBJECTIVE Data-driven strategies that identify and prioritize the public's needs may engender a public health response to improve policing. We assessed how internet searches indicative of interest in police reform changed after Mr Floyd's death. METHODS We monitored daily Google searches (per 10 million total searches) that included the terms "police" and "reform(s)" (eg, "reform the police," "best police reforms," etc) originating from the United States between January 1, 2010, through July 5, 2020. We also monitored searches containing the term "police" with "training," "union(s)," "militarization," or "immunity" as markers of interest in the corresponding reform topics. RESULTS The 41 days following Mr Floyd's death corresponded with the greatest number of police "reform(s)" searches ever recorded, with 1,350,000 total searches nationally. Searches increased significantly in all 50 states and Washington DC. By reform topic, nationally there were 1,220,000 total searches for "police" and "union(s)"; 820,000 for "training"; 360,000 for "immunity"; and 72,000 for "militarization." In terms of searches for all policy topics by state, 33 states searched the most for "training," 16 for "union(s)," and 2 for "immunity." States typically in the southeast had fewer queries related to any police reform topic than other states. States that had a greater percentage of votes for President Donald Trump during the 2016 election searched more often for police "union(s)" while states favoring Secretary Hillary Clinton searched more for police "training." CONCLUSIONS The United States is at a historical juncture, with record interest in topics related to police reform with variability in search terms across states. 
Policy makers can respond to searches by considering the policies their constituencies are searching for online, notably police training and unions. Public health leaders can respond by engaging in the subject of policing and advocating for evidence-based policy reforms.
Affiliation(s)
- John W Ayers
  - University of California San Diego, La Jolla, CA, United States
- Benjamin M Althouse
  - Institute for Disease Modeling, Bill and Melinda Gates Foundation, Seattle, WA, United States
- Adam Poliak
  - Barnard College, Columbia University, New York, NY, United States
- Eric C Leas
  - University of California San Diego, La Jolla, CA, United States
- Alicia L Nobles
  - University of California San Diego, La Jolla, CA, United States
- Mark Dredze
  - Johns Hopkins University, Baltimore, MD, United States
- Davey Smith
  - University of California San Diego, La Jolla, CA, United States
|
15
|
Li Y, Jimeno Yepes A, Xiao C. Combining Social Media and FDA Adverse Event Reporting System to Detect Adverse Drug Reactions. Drug Saf 2020; 43:893-903. [PMID: 32385840 PMCID: PMC7434724 DOI: 10.1007/s40264-020-00943-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 12/28/2022]
Abstract
INTRODUCTION Adverse drug reactions (ADRs) are unintended reactions caused by a drug or combination of drugs taken by a patient. The current safety surveillance system relies on spontaneous reporting systems (SRSs) and more recently on observational health data; however, ADR detection may be delayed and lack geographic diversity. The broad scope of social media conversations, such as those on Twitter, can include health-related topics. Consequently, these data could be used to detect potentially novel ADRs with less latency. Although research regarding ADR detection using social media has made progress, findings are based on single information sources, and no study has yet integrated drug safety evidence from both an SRS and Twitter. OBJECTIVE The aim of this study was to combine signals from an SRS and Twitter to facilitate the detection of safety signals and compare the performance of the combined system with signals generated by individual data sources. METHODS We extracted potential drug-ADR posts from Twitter, used Monte Carlo expectation maximization to generate drug safety signals from both the US FDA Adverse Event Reporting System and posts from Twitter, and then integrated these signals using a Bayesian hierarchical model. The results from the integrated system and two individual sources were evaluated using a reference standard derived from drug labels. Area under the receiver operating characteristics curve (AUC) was computed to measure performance. RESULTS We observed a significant improvement in the AUC of the combined system when comparing it with Twitter alone, and no improvement when comparing with the SRS alone. The AUCs ranged from 0.587 to 0.637 for the combined SRS and Twitter, from 0.525 to 0.534 for Twitter alone, and from 0.612 to 0.642 for the SRS alone. The results varied because different preprocessing procedures were applied to Twitter. 
CONCLUSION The accuracy of signal detection using social media can be improved by combining signals with those from SRSs. However, the combined system cannot achieve better AUC performance than data from FAERS alone, which may indicate that Twitter data are not ready to be integrated into a purely data-driven combination system.
Affiliation(s)
- Ying Li
  - Center for Computational Health, IBM Thomas J. Watson Research Center, 1101 Kitchawan Rd, Yorktown Heights, NY, 10598, USA
- Cao Xiao
  - Analytics Center of Excellence, IQVIA, Cambridge, MA, USA
|
16
|
Mavragani A. Infodemiology and Infoveillance: Scoping Review. J Med Internet Res 2020; 22:e16206. [PMID: 32310818 PMCID: PMC7189791 DOI: 10.2196/16206] [Citation(s) in RCA: 111] [Impact Index Per Article: 27.8] [Received: 09/10/2019] [Revised: 02/05/2020] [Accepted: 02/08/2020] [Indexed: 12/12/2022]
Abstract
BACKGROUND Web-based sources are increasingly employed in the analysis, detection, and forecasting of diseases and epidemics, and in predicting human behavior toward several health topics. This use of the internet has come to be known as infodemiology, a concept introduced by Gunther Eysenbach. Infodemiology and infoveillance studies use web-based data and have become an integral part of health informatics research over the past decade. OBJECTIVE The aim of this paper is to provide a scoping review of the state-of-the-art in infodemiology along with the background and history of the concept, to identify sources and health categories and topics, to elaborate on the validity of the employed methods, and to discuss the gaps identified in current research. METHODS The PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines were followed to extract the publications that fall under the umbrella of infodemiology and infoveillance from the JMIR, PubMed, and Scopus databases. A total of 338 documents were extracted for assessment. RESULTS Of the 338 studies, the vast majority (n=282, 83.4%) were published with JMIR Publications. The Journal of Medical Internet Research features almost half of the publications (n=168, 49.7%), and JMIR Public Health and Surveillance has more than one-fifth of the examined studies (n=74, 21.9%). The interest in the subject has been increasing every year, with 2018 featuring more than one-fourth of the total publications (n=89, 26.3%), and the publications in 2017 and 2018 combined accounted for more than half (n=171, 50.6%) of the total number of publications in the last decade. The most popular source was Twitter with 45.0% (n=152), followed by Google with 24.6% (n=83), websites and platforms with 13.9% (n=47), blogs and forums with 10.1% (n=34), Facebook with 8.9% (n=30), and other search engines with 5.6% (n=19). 
As for the subjects examined, conditions and diseases with 17.2% (n=58) and epidemics and outbreaks with 15.7% (n=53) were the most popular categories identified in this review, followed by health care (n=39, 11.5%), drugs (n=40, 10.4%), and smoking and alcohol (n=29, 8.6%). CONCLUSIONS The field of infodemiology is becoming increasingly popular, employing innovative methods and approaches for health assessment. The use of web-based sources, which provide us with information that would not be accessible otherwise and tackles the issues arising from the time-consuming traditional methods, shows that infodemiology plays an important role in health informatics research.
Affiliation(s)
- Amaryllis Mavragani
  - Department of Computing Science and Mathematics, Faculty of Natural Sciences, University of Stirling, Stirling, United Kingdom
|
17
|
Hochberg I, Allon R, Yom-Tov E. Assessment of the Frequency of Online Searches for Symptoms Before Diagnosis: Analysis of Archival Data. J Med Internet Res 2020; 22:e15065. [PMID: 32141835 PMCID: PMC7084283 DOI: 10.2196/15065] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 06/17/2019] [Revised: 10/07/2019] [Accepted: 12/16/2019] [Indexed: 12/18/2022]
Abstract
Background Surveys suggest that a large proportion of people use the internet to search for information on medical symptoms they experience and that around one-third of the people in the United States self-diagnose using online information. However, surveys are known to be biased, and the true rates at which people search for information on their medical symptoms before receiving a formal medical diagnosis are unknown. Objective This study aimed to estimate the rate at which people search for information on their medical symptoms before receiving a formal medical diagnosis by a health professional. Methods We collected queries made on a general-purpose internet search engine by people in the United States who self-identified their diagnosis from 1 of 20 medical conditions. We focused on conditions that have evident symptoms and are neither screened systematically nor a part of usual medical care. Thus, they are generally diagnosed after the investigation of specific symptoms. We evaluated how many of these people queried for symptoms associated with their medical condition before their formal diagnosis. In addition, we used a survey questionnaire to assess the familiarity of laypeople with the symptoms associated with these conditions. Results On average, 15.49% (1792/12,367, SD 8.4%) of people queried about symptoms associated with their medical condition before receiving a medical diagnosis. A longer duration between the first query for a symptom and the corresponding diagnosis was correlated with an increased likelihood of people querying about those symptoms (rho=0.6; P=.005); similarly, unfamiliarity with the association between a condition and its symptom was correlated with an increased likelihood of people querying about those symptoms (rho=−0.47; P=.08). In addition, worrying symptoms were 14% more likely to be queried about. 
Conclusions Our results indicate that there is large variability in the percentage of people who query the internet for their symptoms before a formal medical diagnosis is made. This finding has important implications for systems that attempt to screen for medical conditions.
Affiliation(s)
- Irit Hochberg
  - Institute of Endocrinology, Diabetes, and Metabolism, Rambam Health Care Campus, Haifa, Israel
  - Bruce Rappaport Faculty of Medicine, Technion - Israel Institute of Technology, Haifa, Israel
- Raviv Allon
  - Bruce Rappaport Faculty of Medicine, Technion - Israel Institute of Technology, Haifa, Israel
- Elad Yom-Tov
  - Microsoft Research, Herzeliya, Israel
  - Faculty of Industrial Engineering and Management, Technion - Israel Institute of Technology, Haifa, Israel
|
18
|
Sadilek A, Hswen Y, Bavadekar S, Shekel T, Brownstein JS, Gabrilovich E. Lymelight: forecasting Lyme disease risk using web search data. NPJ Digit Med 2020; 3:16. [PMID: 32047861 PMCID: PMC7000681 DOI: 10.1038/s41746-020-0222-x] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 04/29/2019] [Accepted: 12/19/2019] [Indexed: 02/02/2023]
Abstract
Lyme disease is the most common tick-borne disease in the Northern Hemisphere. Existing estimates of Lyme disease spread are delayed a year or more. We introduce Lymelight, a new method for monitoring the incidence of Lyme disease in real time. We use a machine-learned classifier of web search sessions to estimate the number of individuals who search for possible Lyme disease symptoms in a given geographical area for two years, 2014 and 2015. We evaluate Lymelight using the official case count data from CDC and find a 92% correlation (p < 0.001) at county level. Importantly, using web search data allows us not only to assess the incidence of the disease, but also to examine the appropriateness of treatments subsequently searched for by the users. Public health implications of our work include monitoring the spread of vector-borne diseases in a timely and scalable manner, complementing existing approaches through real-time detection, which can enable more timely interventions. Our analysis of treatment searches may also help reduce misdiagnosis of the disease.
Affiliation(s)
- Yulin Hswen
  - Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, MA USA
  - Computational Epidemiology Lab, Boston Children’s Hospital, Boston, MA USA
- John S. Brownstein
  - Computational Epidemiology Lab, Boston Children’s Hospital, Boston, MA USA
  - Department of Pediatrics, Harvard Medical School, Massachusetts, USA
|
19
|
Yom-Tov E, Lebwohl B. Adverse events associated with colonoscopy; an examination of online concerns. BMC Gastroenterol 2019; 19:207. [PMID: 31795939 PMCID: PMC6889678 DOI: 10.1186/s12876-019-1127-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/25/2019] [Accepted: 11/21/2019] [Indexed: 01/08/2023]
Abstract
Background Colonoscopy as a screening and diagnostic tool is generally safe and well-tolerated, and significant complications are rare. The rate of more mild adverse effects is difficult to estimate, particularly when such effects do not result in hospital admission. We aimed to identify the rate and timing of adverse effects as reported by users querying symptoms on an internet search engine. Methods We identified queries made to Bing originating from users in the United States containing the word “colonoscopy” during a 12-month period and identified those queries in which the timing of colonoscopy could be estimated. We then identified queries from those same users for medical symptoms during the time span from 5 days before through 30 days after the colonoscopy date. Results Of 641,223 users mentioning colonoscopy, 7013 (1.1%) had a query that enabled identification of their colonoscopy date. The majority of queries about colonoscopy preceded the procedure, and concerned diet. 28% of colonoscopy-related queries were made afterwards, and included queries about diarrhea and cramps, with 2.6% of users querying respiratory symptoms after the procedure, including cough (1.2%) and pneumonia (0.6%). Respiratory symptoms rose significantly at days 7–10 after the colonoscopy. Conclusions Internet search queries for respiratory symptoms rose approximately one week after queries relating to colonoscopy, raising the possibility that such symptoms are an under-reported late adverse effect of the procedure. Given the widespread use of colonoscopy as a screening modality and the rise of anesthesia-assisted colonoscopy in the United States in recent years, this signal is of potential public health concern.
Affiliation(s)
- Elad Yom-Tov
  - Microsoft Research, Herzeliya, Israel
  - Faculty of Industrial Engineering and Management, Technion - Israel Institute of Technology, Haifa, Israel
|
20
|
Borchert JS, Wang B, Ramzanali M, Stein AB, Malaiyandi LM, Dineley KE. Adverse Events Due to Insomnia Drugs Reported in a Regulatory Database and Online Patient Reviews: Comparative Study. J Med Internet Res 2019; 21:e13371. [PMID: 31702558 PMCID: PMC6874799 DOI: 10.2196/13371] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 01/15/2019] [Revised: 08/22/2019] [Accepted: 09/26/2019] [Indexed: 12/14/2022]
Abstract
BACKGROUND Patient online drug reviews are a resource for other patients seeking information about the practical benefits and drawbacks of drug therapies. Patient reviews may also serve as a source of postmarketing safety data that are more user-friendly than regulatory databases. However, the reliability of online reviews has been questioned, because they do not undergo professional review and lack means of verification. OBJECTIVE We evaluated online reviews of hypnotic medications, because they are commonly used and their therapeutic efficacy is particularly amenable to patient self-evaluation. Our primary objective was to compare the types and frequencies of adverse events reported to the Food and Drug Administration Adverse Event Reporting System (FAERS) with analogous information in patient reviews on the consumer health website Drugs.com. The secondary objectives were to describe patient reports of efficacy and adverse events and assess the influence of medication cost, effectiveness, and adverse events on user ratings of hypnotic medications. METHODS Patient ratings and narratives were retrieved from 1407 reviews on Drugs.com between February 2007 and March 2018 for eszopiclone, ramelteon, suvorexant, zaleplon, and zolpidem. Reviews were coded to preferred terms in the Medical Dictionary for Regulatory Activities. These reviews were compared to 5916 cases in the FAERS database from January 2015 to September 2017. RESULTS Similar adverse events were reported to both Drugs.com and FAERS. Both resources identified a lack of efficacy as a common complaint for all five drugs. Both resources revealed that amnesia commonly occurs with eszopiclone, zaleplon, and zolpidem, while nightmares commonly occur with suvorexant. Compared to FAERS, online reviews of zolpidem reported a much higher frequency of amnesia and partial sleep activities. User ratings were highest for zolpidem and lowest for suvorexant. 
Statistical analyses showed that patient ratings are influenced by considerations of efficacy and adverse events, while drug cost is unimportant. CONCLUSIONS For hypnotic medications, online patient reviews and FAERS emphasized similar adverse events. Online reviewers rated drugs based on perception of efficacy and adverse events. We conclude that online patient reviews of hypnotics are a valid source that can supplement traditional adverse event reporting systems.
Affiliation(s)
- Jill S Borchert
  - Chicago College of Pharmacy, Midwestern University, Downers Grove, IL, United States
- Bo Wang
  - Chicago College of Osteopathic Medicine, Midwestern University, Downers Grove, IL, United States
- Muzaina Ramzanali
  - Chicago College of Pharmacy, Midwestern University, Downers Grove, IL, United States
- Amy B Stein
  - Office of Research and Sponsored Programs, Midwestern University, Glendale, AZ, United States
- Latha M Malaiyandi
  - College of Graduate Studies, Midwestern University, Downers Grove, IL, United States
- Kirk E Dineley
  - College of Graduate Studies, Midwestern University, Downers Grove, IL, United States
|
21
|
Hauben M, Reynolds R, Caubel P. Deconstructing the Pharmacovigilance Hype Cycle. Clin Ther 2019; 40:1981-1990.e3. [PMID: 30545608 DOI: 10.1016/j.clinthera.2018.10.021] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 08/22/2018] [Revised: 10/11/2018] [Accepted: 10/24/2018] [Indexed: 12/31/2022]
Abstract
Data science is making increasing contributions to pharmacovigilance. Although the technical innovation of these works is indisputable, efficient progress in real-world pharmacovigilance signal detection may be hampered by corresponding technology life cycle effects, with a resulting tendency to conclude that, with large enough datasets and intricate algorithms, "the numbers speak for themselves," discounting the importance of clinical and scientific judgment. A practical consequence is overzealous declarations regarding the safety or lack of safety of drugs. We describe these concerns through a critical discussion of key results and conclusions from case studies selected to illustrate these points.
|
22
|
Hochberg I, Daoud D, Shehadeh N, Yom-Tov E. Can internet search engine queries be used to diagnose diabetes? Analysis of archival search data. Acta Diabetol 2019; 56:1149-1154. [PMID: 31093762 DOI: 10.1007/s00592-019-01350-5] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 03/04/2019] [Accepted: 04/16/2019] [Indexed: 11/30/2022]
Abstract
AIMS Diabetes is often diagnosed late. This study aimed to assess the possibility for earlier detection of diabetes from search data, using predictive models trained on large-scale data. METHODS We extracted all English-language queries made by people in the USA to Bing during 1 year and identified queries containing symptoms of diabetes. We compared the ability of four different prediction models (linear regression, logistic regression, decision tree and random forest) to distinguish between users who stated that they were diagnosed with diabetes and users who did not refer to diabetes or diabetes drugs but queried about at least one of the symptoms. RESULTS We identified 11,050 "new diabetes users" who stated they had been diagnosed with diabetes and approximately 11.5 million "control users" who queried about symptoms without querying for terms related to diabetes. Both the logistic regression and the random forest models were able to distinguish between the populations with an area under curve of 0.92 which translates to a positive predictive value of 56% at a false-positive rate of 1%. The model could identify patients up to 240 days before they mentioned being diagnosed. CONCLUSIONS Some undiagnosed diabetes patients can be detected accurately according to their symptom queries to a search engine. Such earlier diagnosis, especially in cases of type 1 diabetes, could be clinically meaningful. The ability of search engines to serve as a population-wide screening tool could potentially be improved using additional data provided by users.
Affiliation(s)
- Irit Hochberg
  - Institute of Endocrinology, Diabetes and Metabolism, Rambam Health Care Campus, 8 Ha'Aliya Street, POB 9602, 31096, Haifa, Israel
- Deeb Daoud
  - Institute of Endocrinology, Diabetes and Metabolism, Rambam Health Care Campus, 8 Ha'Aliya Street, POB 9602, 31096, Haifa, Israel
- Naim Shehadeh
  - Institute of Endocrinology, Diabetes and Metabolism, Rambam Health Care Campus, 8 Ha'Aliya Street, POB 9602, 31096, Haifa, Israel
  - Bruce Rappaport Faculty of Medicine, Technion - Israel Institute of Technology, Haifa, Israel
|
23
|
Dai HJ, Wang CK. Classifying adverse drug reactions from imbalanced twitter data. Int J Med Inform 2019; 129:122-132. [PMID: 31445246 DOI: 10.1016/j.ijmedinf.2019.05.017] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 09/16/2018] [Revised: 04/07/2019] [Accepted: 05/21/2019] [Indexed: 10/26/2022]
Abstract
BACKGROUND Nowadays, social media are often being used by general public to create and share public messages related to their health. With the global increase in social media usage, there is a trend of posting information related to adverse drug reactions (ADR). Mining the social media data for this type of information will be helpful for pharmacological post-marketing surveillance and monitoring. Although the concept of using social media to facilitate pharmacovigilance is convincing, construction of automatic ADR detection systems remains a challenge because the corpora compiled from social media tend to be highly imbalanced, posing a major obstacle to the development of classifiers with reliable performance. METHODS Several methods have been proposed to address the challenge of imbalanced corpora. However, we are not aware of any studies that investigated the effectiveness of the strategies of dealing with the problem of imbalanced data in the context of ADR detection from social media. In light of this, we evaluated a variety of imbalanced techniques and proposed a novel word embedding-based synthetic minority over-sampling technique (WESMOTE), which synthesizes new training examples from the sentence representation based on word embeddings. We compared the performance of all methods on two large imbalanced datasets released for the purpose of detecting ADR posts. RESULTS In comparison with the state-of-the-art approaches, the classifiers that incorporated imbalanced classification techniques achieved comparable or better F-scores. All of our best performing configurations combined random under-sampling with techniques including the proposed WESMOTE, boosting and ensemble, implying that an integration of these approaches with under-sampling provides a reliable solution for large imbalanced social media datasets. 
Furthermore, ensemble-based methods like vote-based under-sampling (VUE) and random under-sampling boosting can be alternatives for the hybrid synthetic methods because both methods increase the diversity of the created weak classifiers, leading to better recall and overall F-scores for the minority classes. CONCLUSIONS Data collected from the social media are usually very large and highly imbalanced. In order to maximize the performance of a classifier trained on such data, applications of imbalanced strategies are required. We considered several practical methods for handling imbalanced Twitter data along with their performance on the binary classification task with respect to ADRs. In conclusion, the following practical insights are gained: 1) When dealing with text classification, the proposed word embedding-based synthetic minority over-sampling technique is more effective than traditional synthetic-based over-sampling methods. 2) In cases where large amounts of training data are available, the imbalanced strategies combined with under-sampling techniques are preferred. 3) Finally, employment of advanced methods does not guarantee better performance than simpler ones such as VUE, which achieved high performance with advantages like faster building time and ease of development.
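The combination described in this abstract (random under-sampling of the majority class plus synthetic over-sampling of the minority class in a sentence-embedding space) can be illustrated with a minimal Python sketch. It is not the authors' WESMOTE implementation, whose details the abstract does not give. The sketch assumes that a sentence is represented by the mean of its word vectors and that synthetic points are interpolated between randomly paired minority vectors (classic SMOTE interpolates toward one of the k nearest neighbours instead). All function names and the toy embeddings are hypothetical.

```python
import random


def sentence_vector(tokens, embeddings):
    """Average the word vectors of the tokens that have an embedding."""
    dim = len(next(iter(embeddings.values())))
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return [0.0] * dim
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def smote_like(minority, n_new, rng):
    """Create n_new synthetic vectors by linear interpolation.

    Assumption: partners are drawn at random from the minority class;
    classic SMOTE would pick among the k nearest neighbours.
    """
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)
        lam = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([x + lam * (y - x) for x, y in zip(a, b)])
    return synthetic


def random_undersample(majority, n_keep, rng):
    """Keep a random subset of the majority class, without replacement."""
    return rng.sample(majority, n_keep)


# Hypothetical two-dimensional embeddings, for illustration only.
toy_embeddings = {"bad": [1.0, 0.0], "rash": [0.0, 1.0]}
rng = random.Random(0)
adr_vectors = [
    sentence_vector(["bad", "rash"], toy_embeddings),
    sentence_vector(["rash"], toy_embeddings),
]
extra_adr_vectors = smote_like(adr_vectors, 3, rng)
```

Each synthetic vector lies on the line segment between two real minority vectors, so over-sampling adds variety without leaving the region the minority class already occupies.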
Affiliation(s)
- Hong-Jie Dai
- Department of Electrical Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan, Republic of China; Post Baccalaureate Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan, Republic of China.
- Chen-Kai Wang
- Big Data laboratories of Chunghwa Telecom Laboratories, Taoyuan, Taiwan, Republic of China.
|
24
|
Lebwohl B, Yom-Tov E. Symptoms Prompting Interest in Celiac Disease and the Gluten-Free Diet: Analysis of Internet Search Term Data. J Med Internet Res 2019; 21:e13082. [PMID: 30958273 PMCID: PMC6475820 DOI: 10.2196/13082] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 12/11/2018] [Revised: 02/05/2019] [Accepted: 02/11/2019] [Indexed: 12/18/2022] Open
Abstract
BACKGROUND Celiac disease, a common immune-based disease triggered by gluten, has diverse clinical manifestations, and the relative distribution of symptoms leading to diagnosis has not been well characterized in the population. OBJECTIVE This study aimed to use search engine data to identify a set of symptoms and conditions that would identify individuals at elevated likelihood of a subsequent celiac disease diagnosis. We also measured the relative prominence of these search terms before versus after a search related to celiac disease. METHODS We extracted English-language queries submitted to the Bing search engine in the United States and identified those who submitted a new celiac-related query during a 1-month period, without any celiac-related queries in the preceding 9 months. We compared the ratio between the number of times that each symptom or condition was asked in the 14 days preceding the first celiac-related query of each person and the number of searches for that same symptom or condition in the 14 days after the celiac-related query. RESULTS We identified 90,142 users who made a celiac-related query, of whom 6528 (7%) exhibited sustained interest, defined as making a query on more than 1 day. Though a variety of symptoms and associated conditions were also queried before a celiac-related query, the maximum area under the receiver operating characteristic curve was 0.53. The symptom most likely to be queried more before than after a celiac-related query was diarrhea (query ratio [QR] 1.28). Extraintestinal symptoms queried before a celiac disease query included headache (QR 1.26), anxiety (QR 1.10), depression (QR 1.03), and attention-deficit hyperactivity disorder (QR 1.64). CONCLUSIONS We found an increase in antecedent searches for symptoms known to be associated with celiac disease, a rise in searches for depression and anxiety, and an increase in symptoms that are associated with celiac disease but may not be reported to health care providers. 
The protean clinical manifestations of celiac disease are reflected in the diffuse nature of antecedent internet queries of those interested in celiac disease, underscoring the challenge of effective case-finding strategies.
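The query ratio (QR) used in this abstract, the number of searches for a symptom in the 14 days before a person's first celiac-related query divided by the number in the 14 days after it, can be sketched in Python as follows. This is an illustrative sketch only: the log layout, the function name `query_ratios`, and the sample data are assumptions, and counts are pooled across users before dividing.

```python
from collections import Counter
from datetime import datetime, timedelta

WINDOW = timedelta(days=14)


def query_ratios(user_logs, window=WINDOW):
    """Return {symptom: searches before / searches after} the index query.

    user_logs: iterable of (index_time, [(timestamp, symptom), ...]),
    one entry per user; index_time is that user's first celiac query.
    Symptoms with no searches in the "after" window are skipped.
    """
    before, after = Counter(), Counter()
    for index_time, queries in user_logs:
        for ts, symptom in queries:
            if index_time - window <= ts < index_time:
                before[symptom] += 1
            elif index_time < ts <= index_time + window:
                after[symptom] += 1
    return {s: before[s] / after[s] for s in before if after[s] > 0}


# Hypothetical log for a single user; not data from the study.
index = datetime(2019, 1, 15)
logs = [(index, [
    (datetime(2019, 1, 10), "diarrhea"),
    (datetime(2019, 1, 12), "diarrhea"),
    (datetime(2019, 1, 20), "diarrhea"),
    (datetime(2019, 1, 5), "headache"),
    (datetime(2019, 1, 25), "headache"),
    (datetime(2018, 12, 1), "diarrhea"),  # outside both windows, ignored
])]
ratios = query_ratios(logs)
print(ratios)  # {'diarrhea': 2.0, 'headache': 1.0}
```

A ratio above 1 means the symptom was searched more often before the index query than after it, which is how the abstract reads values such as QR 1.28 for diarrhea.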
Affiliation(s)
- Benjamin Lebwohl
- Celiac Disease Center, Columbia University, New York, NY, United States
- Elad Yom-Tov
- Microsoft Research, Herzeliya, Israel; Technion, Haifa, Israel
|
25
|
Nitzburg G, Weber I, Yom-Tov E. Internet Searches for Medical Symptoms Before Seeking Information on 12-Step Addiction Treatment Programs: A Web-Search Log Analysis. J Med Internet Res 2019; 21:e10946. [PMID: 31066685 PMCID: PMC6533047 DOI: 10.2196/10946] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/02/2018] [Revised: 11/28/2018] [Accepted: 01/26/2019] [Indexed: 12/12/2022] Open
Abstract
Background Brief intervention is a critical method for identifying patients with problematic substance use in primary care settings and for motivating them to consider treatment options. However, despite considerable evidence of delay discounting in patients with substance use disorders, most brief advice by physicians focuses on the long-term negative medical consequences, which may not be the best way to motivate patients to seek treatment information. Objective Identification of the specific symptoms that most motivate individuals to seek treatment information may offer insights for further improving brief interventions. To this end, we used anonymized internet search engine data to investigate which medical conditions and symptoms preceded searches for 12-step meeting locators and general 12-step information. Methods We extracted all queries made by people in the United States on the Bing search engine from November 2016 to July 2017. These queries were filtered for those who mentioned seeking Alcoholics Anonymous (AA) or Narcotics Anonymous (NA); in addition, queries that contained a medical symptom or condition or a synonym thereof were analyzed. We identified medical symptoms and conditions that predicted searches for seeking treatment at different time lags. Specifically, symptom queries were first determined to be significantly predictive of subsequent 12-step queries if the probability of querying a medical symptom by those who later sought information about the 12-step program exceeded the probability of that same query being made by a comparison group of all other Bing users in the United States. Second, we examined symptom queries preceding queries on the 12-step program at time lags of 0-7 days, 7-14 days, and 14-30 days, where the probability of asking about a medical symptom was greater in the 30-day time window preceding 12-step program information-seeking as compared to all previous times that the symptom was queried. 
Results In our sample of 11,784 persons, we found 10 medical symptoms that predicted AA information seeking and 9 symptoms that predicted NA information seeking. Of these symptoms, a substantial number could be categorized as nonsevere in nature. Moreover, when medical symptom persistence was examined across a 1-month time period, a substantial number of nonsevere, yet persistent, symptoms were identified. Conclusions Our results suggest that many common or nonsevere medical symptoms and conditions motivate subsequent interest in AA and NA programs. In addition to highlighting severe long-term consequences, brief interventions could be restructured to highlight how increasing substance misuse can worsen discomfort from common medical symptoms in the short term, as well as how these worsening symptoms could exacerbate social embarrassment or decrease physical attractiveness.
Affiliation(s)
- George Nitzburg
- Teachers College, Columbia University, New York, NY, United States
- Ingmar Weber
- Social Computing Department, Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar
- Elad Yom-Tov
- Microsoft Research, Redmond, WA, United States; Microsoft Research, Herzeliya, Israel; Faculty of Industrial Engineering and Management, Technion - Israel Institute of Technology, Haifa, Israel
|
26
|
Harnessing social media data for pharmacovigilance: a review of current state of the art, challenges and future directions. International Journal of Data Science and Analytics 2019. [DOI: 10.1007/s41060-019-00175-3] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Indexed: 01/07/2023]
|
27
|
|
28
|
Gunn LH, Ter Horst E, Markossian TW, Molina G. Online interest regarding violent attacks, gun control, and gun purchase: A causal analysis. PLoS One 2018; 13:e0207924. [PMID: 30485315 PMCID: PMC6261600 DOI: 10.1371/journal.pone.0207924] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 04/08/2018] [Accepted: 11/06/2018] [Indexed: 11/17/2022] Open
Abstract
BACKGROUND Increased interest in gun ownership and gun control is oftentimes driven by informational shocks in a common factor, namely violent attacks, and by the perceived need for higher levels of safety. A causal depiction of the societal interest around violent attacks, gun control and gun purchase, both synchronous and over time, should be a stepping stone for designing future strategies regarding the safety concerns of the U.S. population. OBJECTIVE To examine the causal relationships between unexpected increases in population interest in violent attacks, gun control, and gun purchase. METHODS Relationships among online searches for information about violent attacks, gun control, and gun purchase occurring between 2004 and 2017 in the U.S. are explained through a novel structural vector autoregressive time series model to account for simultaneous causal relationships. RESULTS More than 20% of the stationary variability in each of gun control and gun purchase interest can be explained by the remaining factors. Gun control interest appears to be caused, in part, by violent attacks informational shocks, yet violent attacks, although impactful, have a lesser effect than gun control debate on long-term gun ownership interests. CONCLUSIONS The form in which gun control has been introduced in public debate may have further increased gun ownership interest. Reactive gun purchase interest may be an unintended side effect of gun control debate. U.S. policymakers may need to rethink current approaches to promotion of gun control, and whether societal policy debate without policy outcomes could be having unintended effects.
Affiliation(s)
- Laura H Gunn
- Department of Public Health Sciences, Health Informatics and Analytics Program, University of North Carolina at Charlotte, Charlotte, NC, United States of America; School of Public Health, Faculty of Medicine, Imperial College London, London, United Kingdom
- Enrique Ter Horst
- Universidad de los Andes (Uniandes), Facultad de administracion, Bogota, Colombia
- Talar W Markossian
- Department of Public Health Sciences, Stritch School of Medicine, Loyola University Chicago, Chicago, Illinois, United States of America
- German Molina
- Quantitative Research, Idalion Capital Group, London, United Kingdom
|
29
|
Kürzinger ML, Schück S, Texier N, Abdellaoui R, Faviez C, Pouget J, Zhang L, Tcherny-Lessenot S, Lin S, Juhaeri J. Web-Based Signal Detection Using Medical Forums Data in France: Comparative Analysis. J Med Internet Res 2018; 20:e10466. [PMID: 30459145 PMCID: PMC6280030 DOI: 10.2196/10466] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 03/23/2018] [Revised: 06/29/2018] [Accepted: 06/29/2018] [Indexed: 01/20/2023] Open
Abstract
BACKGROUND While traditional signal detection methods in pharmacovigilance are based on spontaneous reports, the use of social media is emerging. The potential strength of Web-based data relies on their volume and real-time availability, allowing early detection of signals of disproportionate reporting (SDRs). OBJECTIVE This study aimed (1) to assess the consistency of SDRs detected from patients' medical forums in France compared with those detected from the traditional reporting systems and (2) to assess whether SDRs can be identified in the forums earlier than in the traditional reporting systems. METHODS Messages posted on patients' forums between 2005 and 2015 were used. We retained 8 disproportionality definitions. Comparison of SDRs from the forums with SDRs detected in VigiBase was done by describing the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, receiver operating characteristic curve, and the area under the curve (AUC). The time difference in months between the detection dates of SDRs from the forums and VigiBase was provided. RESULTS The comparison analysis showed that the sensitivity ranged from 29% to 50.6%, the specificity from 86.1% to 95.5%, the PPV from 51.2% to 75.4%, the NPV from 68.5% to 91.6%, and the accuracy from 68% to 87.7%. The AUC reached 0.85 when using the metric empirical Bayes geometric mean. Up to 38% (12/32) of the SDRs were detected earlier in the forums than in VigiBase. CONCLUSIONS The specificity, PPV, and NPV were high. The overall performance was good, showing that data from medical forums may be a valuable source for signal detection. In total, up to 38% (12/32) of the SDRs could have been detected earlier, thus ensuring increased patient safety. 
Further enhancements are needed to investigate the reliability and validation of patients' medical forums worldwide, the extension of this analysis to all possible drugs or at least to a wider selection of drugs, as well as to further assess performance against established signals.
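The agreement measures named in this abstract follow from a 2x2 table that cross-classifies each drug-event pair by whether it is an SDR in the forums and whether it is an SDR in the reference system. A minimal Python sketch of those formulas is given below; the function name and the counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 agreement measures, with the reference system as truth.

    tp: SDR in both sources        fp: SDR in forums only
    fn: SDR in reference only      tn: SDR in neither source
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / total,
    }


# Hypothetical counts chosen only to exercise the formulas.
metrics = diagnostic_metrics(tp=40, fp=10, fn=60, tn=190)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With counts like these, specificity and PPV can be high while sensitivity stays low, the same qualitative pattern the abstract reports.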
Affiliation(s)
- Julie Pouget
- Information Technology and Solutions, Sanofi, Lyon, France
- Ling Zhang
- Global Pharmacovigilance, Sanofi, Bridgewater, NJ, United States
- Stephen Lin
- Global Pharmacovigilance, Sanofi, Bridgewater, NJ, United States
- Juhaeri Juhaeri
- Epidemiology and Benefit Risk Evaluation, Sanofi, Bridgewater, NJ, United States
|
30
|
Trifirò G, Sultana J, Bate A. From Big Data to Smart Data for Pharmacovigilance: The Role of Healthcare Databases and Other Emerging Sources. Drug Saf 2018; 41:143-149. [PMID: 28840504 DOI: 10.1007/s40264-017-0592-4] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Indexed: 12/15/2022]
Abstract
In the last decade 'big data' has become a buzzword used in several industrial sectors, including but not limited to telephony, finance and healthcare. Despite its popularity, it is not always clear what big data refers to exactly. Big data has become a very popular topic in healthcare, where the term primarily refers to the vast and growing volumes of computerized medical information available in the form of electronic health records, administrative or health claims data, disease and drug monitoring registries and so on. This kind of data is generally collected routinely during administrative processes and clinical practice by different healthcare professionals: from doctors recording their patients' medical history, drug prescriptions or medical claims to pharmacists registering dispensed prescriptions. For a long time, this data accumulated without its value being fully recognized and leveraged. Today big data has an important place in healthcare, including in pharmacovigilance. The expanding role of big data in pharmacovigilance includes signal detection, substantiation and validation of drug or vaccine safety signals, and increasingly new sources of information such as social media are also being considered. The aim of the present paper is to discuss the uses of big data for drug safety post-marketing assessment.
Affiliation(s)
- Gianluca Trifirò
- Department of Biomedical and Dental Sciences and Morpho-Functional Imaging, University of Messina, Messina, Italy.
- Department of Medical Informatics, Erasmus Medical Centre, Rotterdam, The Netherlands.
- Janet Sultana
- Department of Biomedical and Dental Sciences and Morpho-Functional Imaging, University of Messina, Messina, Italy
- Department of Medical Informatics, Erasmus Medical Centre, Rotterdam, The Netherlands
- Andrew Bate
- Epidemiology Group Lead, Analytics, Worldwide Safety, Pfizer, Tadworth, UK
- Department of Clinical Pharmacology, New York University (NYU), New York, USA
|
31
|
The effectiveness of public health advertisements to promote health: a randomized-controlled trial on 794,000 participants. NPJ Digit Med 2018; 1:24. [PMID: 31304306 PMCID: PMC6550260 DOI: 10.1038/s41746-018-0031-7] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 02/28/2018] [Revised: 03/30/2018] [Accepted: 04/09/2018] [Indexed: 11/13/2022] Open
Abstract
As public health advertisements move online, it becomes possible to run inexpensive randomized-controlled trials (RCTs) thereof. Here we report the results of an online RCT to improve food choices and integrate exercise into daily activities of internet users. People searching for pre-specified terms were randomized to receive one of several professionally developed campaign advertisements or the “status quo” (ads that would otherwise have been served). For 1-month pre-intervention and post-intervention, their searches for health-promoting goods or services were recorded. Our results show that 48% of people who were exposed to the ads made future searches for weight loss information, compared with 32% of those in the control group—a 50% increase. The advertisements varied in efficacy. However, the effectiveness of the advertisements may be greatly improved by targeting individuals based on their lifestyle preferences and/or sociodemographic characteristics, which together explain 49% of the variation in response to the ads. These results demonstrate that online advertisements hold promise as a mechanism for changing population health behaviors. They also provide researchers powerful ways to measure and improve the effectiveness of online public health interventions. Finally, we show that corporations that use these sophisticated tools to promote unhealthy products can potentially be outbid and outmaneuvered. People who see specific health-promoting messages after searching online for weight-related terms are more likely to subsequently search for information on weight loss interventions. A team led by Elad Yom-Tov from Microsoft Research Israel in Herzeliya conducted a randomized trial involving 794,000 users of the Bing search engine who queried terms related to weight, diet, and exercise. Randomly chosen subjects were shown advertisements designed to promote healthy living, while all other users were shown standard ads. 
The researchers found that 48% of people exposed to the health-promoting advertisements made searches within the next month for weight loss information, compared with only 32% of those in the control group. The findings suggest that targeted online messaging can help change population health behaviors.
|
32
|
Tricco AC, Zarin W, Lillie E, Jeblee S, Warren R, Khan PA, Robson R, Pham B, Hirst G, Straus SE. Utility of social media and crowd-intelligence data for pharmacovigilance: a scoping review. BMC Med Inform Decis Mak 2018; 18:38. [PMID: 29898743 PMCID: PMC6001022 DOI: 10.1186/s12911-018-0621-y] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Received: 10/19/2017] [Accepted: 05/31/2018] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND We conducted a scoping review to characterize the literature on the use of conversations in social media as a potential source of data for detecting adverse events (AEs) related to health products. METHODS Our specific research questions were (1) What social media listening platforms exist to detect adverse events related to health products, and what are their capabilities and characteristics? (2) What is the validity and reliability of data from social media for detecting these adverse events? MEDLINE, EMBASE, Cochrane Library, and relevant websites were searched from inception to May 2016. Any type of document (e.g., manuscripts, reports) that described the use of social media data for detecting health product AEs was included. Two reviewers independently screened citations and full-texts, and one reviewer and one verifier performed data abstraction. Descriptive synthesis was conducted. RESULTS After screening 3631 citations and 321 full-texts, 70 unique documents with 7 companion reports available from 2001 to 2016 were included. Forty-six documents (66%) described an automated or semi-automated information extraction system to detect health product AEs from social media conversations (in the developmental phase). Seven pre-existing information extraction systems to mine social media data were identified in eight documents. Nineteen documents compared AEs reported in social media data with validated data and found consistent AE discovery in all except two documents. None of the documents reported the validity and reliability of the overall system, but some reported on the performance of individual steps in processing the data. The validity and reliability results were found for the following steps in the data processing pipeline: data de-identification (n = 1), concept identification (n = 3), concept normalization (n = 2), and relation extraction (n = 8). The methods varied widely, and some approaches yielded better results than others. 
CONCLUSIONS Our results suggest that the use of social media conversations for pharmacovigilance is in its infancy. Although social media data has the potential to supplement data from regulatory agency databases; is able to capture less frequently reported AEs; and can identify AEs earlier than official alerts or regulatory changes, the utility and validity of the data source remains under-studied. TRIAL REGISTRATION Open Science Framework ( https://osf.io/kv9hu/ ).
Affiliation(s)
- Andrea C. Tricco
- Li Ka Shing Knowledge Institute of St. Michael’s Hospital, 209 Victoria Street, East Building, Toronto, ON M5B 1W8 Canada
- Epidemiology Division, Dalla Lana School of Public Health, University of Toronto, 6th Floor, 155 College St, Toronto, ON M5T 3M7 Canada
- Wasifa Zarin
- Li Ka Shing Knowledge Institute of St. Michael’s Hospital, 209 Victoria Street, East Building, Toronto, ON M5B 1W8 Canada
- Erin Lillie
- Li Ka Shing Knowledge Institute of St. Michael’s Hospital, 209 Victoria Street, East Building, Toronto, ON M5B 1W8 Canada
- Serena Jeblee
- Department of Computer Science, University of Toronto, 10 King’s College Road, Toronto, ON M5S 3G4 Canada
- Rachel Warren
- Li Ka Shing Knowledge Institute of St. Michael’s Hospital, 209 Victoria Street, East Building, Toronto, ON M5B 1W8 Canada
- Paul A. Khan
- Li Ka Shing Knowledge Institute of St. Michael’s Hospital, 209 Victoria Street, East Building, Toronto, ON M5B 1W8 Canada
- Reid Robson
- Li Ka Shing Knowledge Institute of St. Michael’s Hospital, 209 Victoria Street, East Building, Toronto, ON M5B 1W8 Canada
- Ba’ Pham
- Li Ka Shing Knowledge Institute of St. Michael’s Hospital, 209 Victoria Street, East Building, Toronto, ON M5B 1W8 Canada
- Graeme Hirst
- Department of Computer Science, University of Toronto, 10 King’s College Road, Toronto, ON M5S 3G4 Canada
- Sharon E. Straus
- Li Ka Shing Knowledge Institute of St. Michael’s Hospital, 209 Victoria Street, East Building, Toronto, ON M5B 1W8 Canada
- Department of Geriatric Medicine, Faculty of Medicine, University of Toronto, 27 Kings College Circle, Toronto, ON M5S 1A1 Canada
|
33
|
Zeraatkar K, Ahmadi M. Trends of infodemiology studies: a scoping review. Health Info Libr J 2018; 35:91-120. [PMID: 29729073 DOI: 10.1111/hir.12216] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 07/01/2017] [Accepted: 03/17/2018] [Indexed: 12/15/2022]
Abstract
INTRODUCTION The health care industry is rich in data and information. Web technologies, such as search engines and social media, have provided an opportunity for the management of user generated data in real time in the form of infodemiology studies. The aim of this study was to investigate infodemiology studies conducted during 2002-2016, and compare them based on developed, developing and in transition countries. METHODS This scoping review was conducted in 2017 with the help of the PRISMA guidelines. PubMed, Scopus, Science Direct, Web of Knowledge, Google Scholar, Wiley and Springer databases were searched between the years 2002 and 2016. Finally, 56 articles were included in the review and analysed. RESULTS The initial infodemiology studies pertain to the quality assessment of the hospital's websites. Most of the studies were on developed countries, based on flu, and published in the Journal of Medical Internet Research. CONCLUSION The infodemiology approach provides unmatched opportunities for the management of health data and information generated by the users. Using this potential will provide unique opportunities for the health information need assessment in real time by health librarians and thereby provide evidence based health information to the people.
Affiliation(s)
- Kimia Zeraatkar
- Department of Health Information Technology, Iran University of Medical Sciences, Tehran, Iran
- Maryam Ahmadi
- Department of Health Information Technology, Iran University of Medical Sciences, Tehran, Iran
|
34
|
MacKinlay A, Aamer H, Yepes AJ. Detection of Adverse Drug Reactions using Medical Named Entities on Twitter. AMIA ... Annual Symposium Proceedings. AMIA Symposium 2018; 2017:1215-1224. [PMID: 29854190 PMCID: PMC5977585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Adverse Drug Reactions (ADRs) are unintentional reactions caused by a drug or combination of drugs taken by a patient. The current ADR reporting systems inevitably have delays in reporting such events. The broad scope of social media conversations on sites such as Twitter means that health-related topics will inevitably be covered. These sites could therefore be used to detect potentially novel ADRs with less latency, for subsequent further investigation. In this work, we investigate ADR surveillance using a large corpus of Twitter data, containing around 50 billion tweets spanning 3 years (2012-2014), and evaluate against over 3000 drugs reported in the FAERS database. This is both a larger corpus and a broader selection of drugs than previous work in the domain. We compare the ADRs identified using our method to the FDA Adverse Event Reporting System (FAERS) database of ADRs reported using more traditional techniques, and find that Twitter is a useful resource for ADR detection, with up to 72% micro-averaged precision. Micro-averaged recall of 6% is achievable using only 10% of Twitter, indicating that with a higher-volume or targeted feed it would be possible to detect a large percentage of ADRs.
|
35
|
Oren E, Frere J, Yom-Tov E, Yom-Tov E. Respiratory syncytial virus tracking using internet search engine data. BMC Public Health 2018; 18:445. [PMID: 29615018 PMCID: PMC5883276 DOI: 10.1186/s12889-018-5367-z] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 07/10/2017] [Accepted: 03/22/2018] [Indexed: 01/25/2023] Open
Abstract
Background Respiratory Syncytial Virus (RSV) is the leading cause of hospitalization in children less than 1 year of age in the United States. Internet search engine queries may provide high resolution temporal and spatial data to estimate and predict disease activity. Methods After filtering an initial list of 613 symptoms using high-resolution Bing search logs, we used Google Trends data between 2004 and 2016 for a smaller list of 50 terms to build predictive models of RSV incidence for five states where long-term surveillance data was available. We then used domain adaptation to model RSV incidence for the 45 remaining US states. Results Surveillance data sources (hospitalization and laboratory reports) were highly correlated, as were laboratory reports with search engine data. The four terms which were most often statistically significantly correlated as time series with the surveillance data in the five state models were RSV, flu, pneumonia, and bronchiolitis. Using our models, we tracked the spread of RSV by observing the time of peak use of the search term in different states. In general, the RSV peak moved from south-east (Florida) to the north-west US. Conclusions Our study represents the first time that RSV has been tracked using Internet data results and highlights successful use of search filters and domain adaptation techniques, using data at multiple resolutions. Our approach may assist in identifying spread of both local and more widespread RSV transmission and may be applicable to other seasonal conditions where comprehensive epidemiological data is difficult to collect or obtain.
Affiliation(s)
- Eyal Oren
- Division of Epidemiology & Biostatistics, Graduate School of Public Health, San Diego State University, San Diego, CA, USA; Department of Epidemiology & Biostatistics, University of Arizona College of Public Health, Tucson, AZ, USA.
- Justin Frere
- Department of Epidemiology & Biostatistics, University of Arizona College of Public Health, Tucson, AZ, USA
|
36
|
Giat E, Yom-Tov E. Evidence From Web-Based Dietary Search Patterns to the Role of B12 Deficiency in Non-Specific Chronic Pain: A Large-Scale Observational Study. J Med Internet Res 2018; 20:e4. [PMID: 29305340 PMCID: PMC5775484 DOI: 10.2196/jmir.8667] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 08/08/2017] [Revised: 11/01/2017] [Accepted: 11/04/2017] [Indexed: 01/22/2023] Open
Abstract
Background Profound vitamin B12 deficiency is a known cause of disease, but the role of low or intermediate levels of B12 in the development of neuropathy and other neuropsychiatric symptoms, as well as the relationship between eating meat and B12 levels, is unclear. Objective The objective of our study was to investigate the role of low or intermediate levels of B12 in the development of neuropathy and other neuropsychiatric symptoms. Methods We used food-related Internet search patterns from a sample of 8.5 million people based in the US as a proxy for B12 intake and correlated these searches with Internet searches related to possible effects of B12 deficiency. Results Food-related search patterns were highly correlated with known consumption and food-related searches (ρ=.69). Awareness of B12 deficiency was associated with a higher consumption of B12-rich foods and with queries for B12 supplements. Searches for terms related to neurological disorders were correlated with searches for B12-poor foods, in contrast with control terms. Popular medicines, those having fewer indications, and those which are predominantly used to treat pain, were more strongly correlated with the ability to predict neuropathic pain queries using the B12 contents of food. Conclusions Our findings show that Internet search patterns are a useful way of investigating health questions in large populations, and suggest that low B12 intake may be associated with a broader spectrum of neurological disorders than previously thought.
Affiliation(s)
- Eitan Giat
- Rheumatology Unit, The Autoimmune Center, Sheba Medical Center, Ramat Gan, Israel
|
37
|
Harpaz R, DuMouchel W, Schuemie M, Bodenreider O, Friedman C, Horvitz E, Ripple A, Sorbello A, White RW, Winnenburg R, Shah NH. Toward multimodal signal detection of adverse drug reactions. J Biomed Inform 2017; 76:41-49. [PMID: 29081385 PMCID: PMC8502488 DOI: 10.1016/j.jbi.2017.10.013] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 01/14/2017] [Revised: 10/14/2017] [Accepted: 10/24/2017] [Indexed: 11/27/2022]
Abstract
OBJECTIVE Improving mechanisms to detect adverse drug reactions (ADRs) is key to strengthening post-marketing drug safety surveillance. Signal detection is presently unimodal, relying on a single information source. Multimodal signal detection is based on jointly analyzing multiple information sources. Building on and expanding the work done in prior studies, the aim of the article is to further research on multimodal signal detection, explore its potential benefits, and propose methods for its construction and evaluation. MATERIAL AND METHODS Four data sources are investigated: FDA's adverse event reporting system, insurance claims, the MEDLINE citation database, and the logs of major Web search engines. Published methods are used to generate and combine signals from each data source. Two distinct reference benchmarks, corresponding to well-established and recently labeled ADRs respectively, are used to evaluate the performance of multimodal signal detection in terms of area under the ROC curve (AUC) and lead-time-to-detection, with the latter relative to labeling revision dates. RESULTS Limited to our reference benchmarks, multimodal signal detection provides AUC improvements ranging from 0.04 to 0.09 based on a widely used evaluation benchmark, and a comparative added lead-time of 7-22 months relative to labeling revision dates from a time-indexed benchmark. CONCLUSIONS The results support the notion that utilizing and jointly analyzing multiple data sources may lead to improved signal detection. Given certain data and benchmark limitations, the early stage of development, and the complexity of ADRs, it is currently not possible to make definitive statements about the ultimate utility of the concept. Continued development of multimodal signal detection requires a deeper understanding of the data sources used, additional benchmarks, and further research on methods to generate and synthesize signals.
Affiliation(s)
- Rave Harpaz
- Oracle Health Sciences, Bedford, MA, United States.
- Anna Ripple
- National Library of Medicine, NIH, Bethesda, MD, United States
- Nigam H Shah
- Stanford University, Stanford, CA, United States
|
38
|
Yom-Tov E, Lev-Ran S. Adverse Reactions Associated With Cannabis Consumption as Evident From Search Engine Queries. JMIR Public Health Surveill 2017; 3:e77. [PMID: 29074469 PMCID: PMC5680525 DOI: 10.2196/publichealth.8391] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 07/10/2017] [Revised: 09/05/2017] [Accepted: 09/12/2017] [Indexed: 11/13/2022] Open
Abstract
Background Cannabis is one of the most widely used psychoactive substances worldwide, but adverse drug reactions (ADRs) associated with its use are difficult to study because of its prohibited status in many countries. Objective Internet search engine queries have been used to investigate ADRs in pharmaceutical drugs. In this proof-of-concept study, we tested whether these queries can be used to detect the adverse reactions of cannabis use. Methods We analyzed anonymized queries from US-based users of Bing, a widely used search engine, made over a period of 6 months and compared the results with the prevalence of cannabis use as reported in the US National Survey on Drug Use in the Household (NSDUH) and with ADRs reported in the Food and Drug Administration’s Adverse Drug Reporting System. Predicted prevalence of cannabis use was estimated from the fraction of people making queries about cannabis, marijuana, and 121 additional synonyms. Predicted ADRs were estimated from queries containing layperson descriptions of a list of 195 ICD-10 symptoms. Results Our results indicated that the predicted prevalence of cannabis use at the US census regional level reaches an R² of .71 when compared with NSDUH data. Queries for ADRs made by people who also searched for cannabis reveal many of the known adverse effects of cannabis (eg, cough and psychotic symptoms), as well as plausible unknown reactions (eg, pyrexia). Conclusions These results indicate that search engine queries can serve as an important tool for the study of adverse reactions of illicit drugs, which are difficult to study in other settings.
Affiliation(s)
- Shaul Lev-Ran
- Lev Hasharon Medical Center, Pardesya, Israel; Sackler Faculty of Medicine, Tel-Aviv University, Tel-Aviv, Israel
|
39
|
Vasconcellos-Silva PR, Griep RH, de Souza MC. [Patterns of access to information on protection against UV during the Brazilian summer: is there such a thing as the "summer effect"?]. CIENCIA & SAUDE COLETIVA 2017. [PMID: 26221818 DOI: 10.1590/1413-81232015208.18932014] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022] Open
Abstract
Internet search patterns associated with "windows" of collective interest have been increasingly investigated in the field of public health. This article sets out to identify search patterns relating to the quest for information on skin protection after the perception of excessive exposure to UV radiation - the so-called "summer effect" as it is commonly referred to in Brazil. To calculate the number of hits on the Brazilian National Cancer Institute website - a renowned source of information resources on prevention - log analyzer software was used to measure the volume of hits on specific content pages. The pages on skin protection and self-examination (pages of interest) were monitored over a 48-month period. It was seen that, although the monthly average of hits on pages of interest revealed statistically significant annual growth, the results for the analysis of variance showed no significant differences between the number of hits in the summer compared with other months (p = 0.7491). In short, the perception of intense exposure to the summer sun did not encourage further interest to search for information on prevention.
|
40
|
Combination of Deep Recurrent Neural Networks and Conditional Random Fields for Extracting Adverse Drug Reactions from User Reviews. JOURNAL OF HEALTHCARE ENGINEERING 2017; 2017:9451342. [PMID: 29177027 PMCID: PMC5605929 DOI: 10.1155/2017/9451342] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 04/14/2017] [Accepted: 07/27/2017] [Indexed: 01/30/2023]
Abstract
Adverse drug reactions (ADRs) are an essential part of the analysis of drug use, measuring drug use benefits, and making policy decisions. Traditional channels for identifying ADRs are reliable but very slow and only produce a small amount of data. Text reviews, either on specialized web sites or in general-purpose social networks, may lead to a data source of unprecedented size, but identifying ADRs in free-form text is a challenging natural language processing problem. In this work, we propose a novel model for this problem, uniting recurrent neural architectures and conditional random fields. We evaluate our model with a comprehensive experimental study, showing improvements over state-of-the-art methods of ADR extraction.
|
41
|
|
42
|
Yom-Tov E. Predicting Drug Recalls From Internet Search Engine Queries. IEEE JOURNAL OF TRANSLATIONAL ENGINEERING IN HEALTH AND MEDICINE-JTEHM 2017; 5:4400106. [PMID: 28845371 PMCID: PMC5568020 DOI: 10.1109/jtehm.2017.2732945] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 11/30/2016] [Revised: 05/21/2017] [Accepted: 07/23/2017] [Indexed: 01/01/2023]
Abstract
Batches of pharmaceuticals are sometimes recalled from the market when a safety issue or a defect is detected in specific production runs of a drug. Such problems are usually detected when patients or healthcare providers report abnormalities to medical authorities. Here, we test the hypothesis that defective production lots can be detected earlier by monitoring queries to Internet search engines. We extracted queries from the USA to the Bing search engine that mentioned one of 5195 pharmaceutical drugs during 2015, together with all recall notifications issued by the Food and Drug Administration (FDA) during that year. By using attributes that quantify the change in query volume at the state level, we attempted to predict whether a recall of a specific drug will be ordered by the FDA within a time horizon ranging from 1 to 40 days in the future. Our results show that future drug recalls can indeed be identified with an AUC of 0.791 and a lift at 5% of approximately 6 when predicting a recall occurring one day ahead. This performance degrades as prediction is made for longer periods ahead. The most indicative attributes for prediction are sudden spikes in query volume about a specific medicine in each state. Recalls of prescription drugs and those estimated to be of medium risk are more likely to be identified using search query data. These findings suggest that aggregated Internet search engine data can be used to facilitate early warning of faulty batches of medicines.
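The "lift at 5%" figure quoted in this abstract can be read as follows: rank all cases by predicted score, take the top 5%, and divide the recall rate in that group by the recall rate overall. The sketch below is only an illustration of that general definition, not the paper's code; the function name `lift_at` and the toy scores and labels are hypothetical.

```python
def lift_at(scores, labels, fraction=0.05):
    """Lift = positive rate in the top-scoring `fraction` of cases
    divided by the positive rate over all cases."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n = len(scores)
    if n == 0 or n != len(labels):
        raise ValueError("scores and labels must be non-empty and equal in length")
    k = max(1, int(round(n * fraction)))  # size of the top group
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(label for _, label in ranked[:k]) / k
    base_rate = sum(labels) / n
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no positives")
    return top_rate / base_rate


# Hypothetical toy data: 100 drug-days, 10 of which precede a recall.
toy_scores = [i / 100 for i in range(100)]
toy_positives = {0, 10, 20, 30, 40, 50, 60, 97, 98, 99}
toy_labels = [1 if i in toy_positives else 0 for i in range(100)]

# Three of the five highest-scored cases are positives (rate 0.6)
# against a base rate of 0.1, so the lift is about 6.
toy_lift = lift_at(toy_scores, toy_labels, fraction=0.05)
```

A lift near 6 therefore means that the highest-scored 5% of cases contain roughly six times as many true recalls as a random 5% sample would.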
|
43
|
Menachemi N, Rahurkar S, Rahurkar M. Using Web-Based Search Data to Study the Public's Reactions to Societal Events: The Case of the Sandy Hook Shooting. JMIR Public Health Surveill 2017; 3:e12. [PMID: 28336508 PMCID: PMC5383805 DOI: 10.2196/publichealth.6033] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 06/08/2016] [Revised: 10/29/2016] [Accepted: 02/03/2017] [Indexed: 11/21/2022] Open
Abstract
Background Internet search is the most common activity on the World Wide Web and generates a vast amount of user-reported data regarding their information-seeking preferences and behavior. Although this data has been successfully used to examine outbreaks, health care utilization, and outcomes related to quality of care, its value in informing public health policy remains unclear. Objective The aim of this study was to evaluate the role of Internet search query data in health policy development. To do so, we studied the public’s reaction to a major societal event in the context of the 2012 Sandy Hook School shooting incident. Methods Query data from the Yahoo! search engine regarding firearm-related searches was analyzed to examine changes in user-selected search terms and subsequent websites visited for a period of 14 days before and after the shooting incident. Results A total of 5,653,588 firearm-related search queries were analyzed. In the after period, queries increased for search terms related to “guns” (+50.06%), “shooting incident” (+333.71%), “ammunition” (+155.14%), and “gun-related laws” (+535.47%). The highest increase (+1054.37%) in Web traffic was seen by news websites following “shooting incident” queries whereas searches for “guns” (+61.02%) and “ammunition” (+173.15%) resulted in notable increases in visits to retail websites. Firearm-related queries generally returned to baseline levels after approximately 10 days. Conclusions Search engine queries present a viable infodemiology metric on public reactions and subsequent behaviors to major societal events and could be used by policymakers to inform policy development.
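The percentage increases reported in this abstract compare query counts in the 14 days after the event with counts in the 14 days before it. A minimal sketch of that arithmetic follows; the function name `percent_change` and the example counts are hypothetical and are not taken from the study's data.

```python
def percent_change(before, after):
    """Relative change from the `before` window to the `after` window, in percent."""
    if before == 0:
        raise ValueError("percent change is undefined for a zero baseline")
    return (after - before) / before * 100.0


# Hypothetical counts: 200 queries in the before window, 300 in the after window.
example_change = percent_change(200, 300)  # a 50% increase
```

Under this definition, an increase above 1000%, as reported for news-site traffic, means the after-window count was more than eleven times the before-window count.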
Affiliation(s)
- Nir Menachemi
- Richard M. Fairbanks School of Public Health, Health Policy and Management, Indiana University-IUPUI, Indianapolis, IN, United States; Regenstrief Institute, Center for Biomedical Informatics, Indianapolis, IN, United States
- Saurabh Rahurkar
- Regenstrief Institute, Center for Biomedical Informatics, Indianapolis, IN, United States
|
44
|
Validation of New Signal Detection Methods for Web Query Log Data Compared to Signal Detection Algorithms Used With FAERS. Drug Saf 2017; 40:399-408. [DOI: 10.1007/s40264-017-0507-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 10/20/2022]
|
45
|
Agarwal V, Zhang L, Zhu J, Fang S, Cheng T, Hong C, Shah NH. Impact of Predicting Health Care Utilization Via Web Search Behavior: A Data-Driven Analysis. J Med Internet Res 2016; 18:e251. [PMID: 27655225 PMCID: PMC5052461 DOI: 10.2196/jmir.6240] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 06/20/2016] [Revised: 07/26/2016] [Accepted: 07/27/2016] [Indexed: 01/12/2023] Open
Abstract
BACKGROUND By recent estimates, the steady rise in health care costs has deprived more than 45 million Americans of health care services and has encouraged health care providers to better understand the key drivers of health care utilization from a population health management perspective. Prior studies suggest the feasibility of mining population-level patterns of health care resource utilization from observational analysis of Internet search logs; however, the utility of the endeavor to the various stakeholders in a health ecosystem remains unclear. OBJECTIVE The aim was to carry out a closed-loop evaluation of the utility of health care use predictions using the conversion rates of advertisements that were displayed to the predicted future utilizers as a surrogate. The statistical models to predict the probability of a user's future visit to a medical facility were built using effective predictors of health care resource utilization, extracted from a deidentified dataset of geotagged mobile Internet search logs representing searches made by users of the Baidu search engine between March 2015 and May 2015. METHODS We inferred presence within the geofence of a medical facility from location and duration information from users' search logs and putatively assigned medical facility visit labels to qualifying search logs. We constructed a matrix of general, semantic, and location-based features from search logs of users that had 42 or more search days preceding a medical facility visit as well as from search logs of users that had no medical visits and trained statistical learners for predicting future medical visits. We then carried out a closed-loop evaluation of the utility of health care use predictions using the show conversion rates of advertisements displayed to the predicted future utilizers. In the context of behaviorally targeted advertising, wherein health care providers are interested in minimizing their cost per conversion, the association between show conversion rate and predicted utilization score served as a surrogate measure of the model's utility. RESULTS We obtained the highest area under the curve (0.796) in medical visit prediction with our random forests model and daywise features. Ablating feature categories one at a time showed that the model performance worsened the most when location features were dropped. An online evaluation in which advertisements were served to users who had a high predicted probability of a future medical visit showed a 3.96% increase in the show conversion rate. CONCLUSIONS Results from our experiments done in a research setting suggest that it is possible to accurately predict future patient visits from geotagged mobile search logs. Results from the offline and online experiments on the utility of health utilization predictions suggest that such prediction can have utility for health care providers.
Affiliation(s)
- Vibhu Agarwal
- Biomedical Informatics Training Program, Stanford University, Stanford, CA, United States.
|
46
|
Sharma V, Holmes JH, Sarkar IN. Identifying Complementary and Alternative Medicine Usage Information from Internet Resources. A Systematic Review. Methods Inf Med 2016; 55:322-32. [PMID: 27352304 PMCID: PMC4975632 DOI: 10.3414/me15-01-0154] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 11/24/2015] [Accepted: 04/25/2016] [Indexed: 02/03/2023]
Abstract
OBJECTIVES Identify and highlight research issues and methods used in studying Complementary and Alternative Medicine (CAM) information needs, access, and exchange over the Internet. METHODS A literature search was conducted using Preferred Reporting Items for Systematic Reviews and Meta-Analysis guidelines from PubMed to identify articles that have studied Internet use in the CAM context. Additional searches were conducted at Nature.com and Google Scholar. RESULTS The Internet provides a major medium for attaining CAM information and can also serve as an avenue for conducting CAM-related surveys. Based on the literature analyzed in this review, there seems to be significant interest in developing methodologies for identifying CAM treatments, including the analysis of search query data and social media platform discussions. Several studies have also underscored the challenges in developing approaches for identifying the reliability of CAM-related information on the Internet, which may not be supported with reliable sources. The overall findings of this review suggest that there are opportunities for developing approaches for making available accurate information and developing ways to restrict the spread and sale of potentially harmful CAM products and information. CONCLUSIONS Advances in Internet research are yet to be used in the context of understanding CAM prevalence and perspectives. Such approaches may provide valuable insights into the current trends and needs in the context of CAM use and spread.
Affiliation(s)
- Indra N Sarkar
- Indra Neil Sarkar, Ph.D., MLIS, Center for Biomedical Informatics, Brown University, Box G-R, Providence, RI 02912, USA
|
47
|
Differences in physical status, mental state and online behavior of people in pro-anorexia web communities. Eat Behav 2016; 22:109-112. [PMID: 27183245 DOI: 10.1016/j.eatbeh.2016.05.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 12/02/2015] [Revised: 04/07/2016] [Accepted: 05/09/2016] [Indexed: 11/23/2022]
Abstract
BACKGROUND There is a debate about the effects of pro-anorexia (colloquially referred to as pro-ana) websites. Research suggests that the effect of these websites is not straightforward. Indeed, the actual function of these sites is disputed, with studies indicating both negative and positive effects. AIM This is the first study which systematically examined the differences between pro-anorexia web communities in four main aspects: web language used (posts); web interests/search behaviors (queries); users' self-reported weight status and weight goals; and associated self-reported mood/pathology. METHODS We collected three primary sources of data, including messages posted on three pro-ana websites, a survey completed by over 1000 participants of a pro-ana website, and the searches made on the Bing search engine by pro-anorexia users. These data were analyzed for content, reported demographics and pathology, and behavior over time. RESULTS Although members of the main pro-ana website investigated appear to be depressed, with high rates of self-harm and suicide attempts, users are significantly more interested in treatment, have wishes of procreation and reported the highest goal weights among the investigated sites. In contrast, users of other pro-ana websites investigated are more interested in morbid themes including depression, self-harm and suicide. The percentage of severely malnourished website users, in general, appears to be small (20%). CONCLUSIONS Our results indicate that a new strategy is required to facilitate the communication between mental health specialists and pro-ana web users, recognizing the differences in harm associated with different websites.
|
48
|
Banda JM, Evans L, Vanguri RS, Tatonetti NP, Ryan PB, Shah NH. A curated and standardized adverse drug event resource to accelerate drug safety research. Sci Data 2016; 3:160026. [PMID: 27193236 PMCID: PMC4872271 DOI: 10.1038/sdata.2016.26] [Citation(s) in RCA: 120] [Impact Index Per Article: 15.0] [Received: 12/17/2015] [Accepted: 03/24/2016] [Indexed: 11/08/2022] Open
Abstract
Identification of adverse drug reactions (ADRs) during the post-marketing phase is one of the most important goals of drug safety surveillance. Spontaneous reporting systems (SRS) data, which are the mainstay of traditional drug safety surveillance, are used for hypothesis generation and to validate the newer approaches. The publicly available US Food and Drug Administration (FDA) Adverse Event Reporting System (FAERS) data require substantial curation before they can be used appropriately, and applying different strategies for data cleaning and normalization can have a material impact on analysis results. We provide a curated and standardized version of FAERS, removing duplicate case records, applying standardized vocabularies with drug names mapped to RxNorm concepts and outcomes mapped to SNOMED-CT concepts, and providing pre-computed summary statistics about drug-outcome relationships for general consumption. This publicly available resource, along with the source code, will accelerate drug safety research by reducing the amount of time spent performing data management on the source FAERS reports, improving the quality of the underlying data, and enabling standardized analyses using common vocabularies.
Affiliation(s)
- Juan M. Banda
- Center for Biomedical Informatics Research, Stanford University, Stanford, California 94305, USA
- Lee Evans
- LTS Computing LLC, West Chester, Pennsylvania 19380, USA
- Rami S. Vanguri
- Department of Biomedical Informatics, Columbia University, New York, New York 10032, USA
- Nicholas P. Tatonetti
- Department of Biomedical Informatics, Columbia University, New York, New York 10032, USA
- Patrick B. Ryan
- Janssen Research & Development, LLC, Titusville, New Jersey 08869, USA
- Nigam H. Shah
- Center for Biomedical Informatics Research, Stanford University, Stanford, California 94305, USA
|
49
|
Hodos RA, Kidd BA, Khader S, Readhead BP, Dudley JT. In silico methods for drug repurposing and pharmacology. WILEY INTERDISCIPLINARY REVIEWS. SYSTEMS BIOLOGY AND MEDICINE 2016; 8:186-210. [PMID: 27080087 PMCID: PMC4845762 DOI: 10.1002/wsbm.1337] [Citation(s) in RCA: 181] [Impact Index Per Article: 22.6] [Received: 11/30/2015] [Revised: 02/08/2016] [Accepted: 02/11/2016] [Indexed: 12/18/2022]
Abstract
Data in the biological, chemical, and clinical domains are accumulating at ever-increasing rates and have the potential to accelerate and inform drug development in new ways. Challenges and opportunities now lie in developing analytic tools to transform these often complex and heterogeneous data into testable hypotheses and actionable insights. This is the aim of computational pharmacology, which uses in silico techniques to better understand and predict how drugs affect biological systems, which can in turn improve clinical use, avoid unwanted side effects, and guide selection and development of better treatments. One exciting application of computational pharmacology is drug repurposing: finding new uses for existing drugs. Already yielding many promising candidates, this strategy has the potential to improve the efficiency of the drug development process and reach patient populations with previously unmet needs such as those with rare diseases. While current techniques in computational pharmacology and drug repurposing often focus on just a single data modality such as gene expression or drug-target interactions, we argue that methods such as matrix factorization that can integrate data within and across diverse data types have the potential to improve predictive performance and provide a fuller picture of a drug's pharmacological action.
Affiliation(s)
- Rachel A Hodos
- New York University and Icahn School of Medicine at Mt. Sinai, New York, NY
- Brian A Kidd
- Icahn School of Medicine at Mt. Sinai, New York, NY
|
50
|
White RW, Wang S, Pant A, Harpaz R, Shukla P, Sun W, DuMouchel W, Horvitz E. Early identification of adverse drug reactions from search log data. J Biomed Inform 2016; 59:42-8. [DOI: 10.1016/j.jbi.2015.11.005] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 08/21/2015] [Revised: 11/07/2015] [Accepted: 11/12/2015] [Indexed: 01/28/2023]
|