1
Lossio-Ventura JA, Weger R, Lee AY, Guinee EP, Chung J, Atlas L, Linos E, Pereira F. A Comparison of ChatGPT and Fine-Tuned Open Pre-Trained Transformers (OPT) Against Widely Used Sentiment Analysis Tools: Sentiment Analysis of COVID-19 Survey Data. JMIR Ment Health 2024; 11:e50150. [PMID: 38271138 PMCID: PMC10813836 DOI: 10.2196/50150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Revised: 11/16/2023] [Accepted: 11/17/2023] [Indexed: 01/27/2024]
Abstract
BACKGROUND Health care providers and health-related researchers face significant challenges when applying sentiment analysis tools to health-related free-text survey data. Most state-of-the-art applications were developed in domains such as social media, and their performance in the health care context remains relatively unknown. Moreover, existing studies indicate that these tools often lack accuracy and produce inconsistent results. OBJECTIVE This study aims to address the lack of comparative analysis of sentiment analysis tools applied to health-related free-text survey data in the context of COVID-19. The objective was to automatically predict sentence sentiment for 2 independent COVID-19 survey data sets from the National Institutes of Health and Stanford University. METHODS Gold standard labels were created for a subset of each data set using a panel of human raters. We compared 8 state-of-the-art sentiment analysis tools on both data sets to evaluate variability and disagreement across tools. In addition, few-shot learning was explored by fine-tuning Open Pre-Trained Transformers (OPT; a large language model [LLM] with publicly available weights) using a small annotated subset, and zero-shot learning was explored using ChatGPT (an LLM without available weights). RESULTS The comparison of sentiment analysis tools revealed high variability and disagreement across the evaluated tools when applied to health-related survey data. OPT and ChatGPT demonstrated superior performance, outperforming all other sentiment analysis tools. Moreover, ChatGPT outperformed OPT, exhibiting higher accuracy by 6% and higher F-measure by 4% to 7%. CONCLUSIONS This study demonstrates the effectiveness of LLMs, particularly the few-shot learning and zero-shot learning approaches, in the sentiment analysis of health-related survey data. 
These results have implications for saving human labor and improving efficiency in sentiment analysis tasks, contributing to advancements in the field of automated sentiment analysis.
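The comparison against gold standard labels described in this abstract rests on accuracy and F-measure. The following is a minimal sketch of how those two scores could be computed for any tool; the label names, the toy data, and the choice of a macro-averaged F1 are assumptions made for illustration and are not taken from the study.

```python
def accuracy(gold, pred):
    """Share of sentences whose predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores over the gold labels."""
    scores = []
    for label in sorted(set(gold)):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: six sentences rated by humans and by two tools.
gold = ["pos", "neg", "neu", "pos", "neg", "neu"]
tool_a = ["pos", "neg", "neu", "pos", "pos", "neu"]
tool_b = ["pos", "neu", "neu", "neg", "pos", "neu"]

for name, pred in (("tool_a", tool_a), ("tool_b", tool_b)):
    print(name, round(accuracy(gold, pred), 3), round(macro_f1(gold, pred), 3))
```

Running the same two functions over every tool's predictions gives one row per tool, which is enough to see both the ranking and the spread between tools.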
Affiliation(s)
- Rachel Weger
- School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Angela Y Lee
- Department of Communication, Stanford University, Stanford, CA, United States
- Emily P Guinee
- National Institute of Mental Health, National Institutes of Health, Bethesda, MD, United States
- Joyce Chung
- National Institute of Mental Health, National Institutes of Health, Bethesda, MD, United States
- Lauren Atlas
- National Center For Complementary and Alternative Medicine, National Institutes of Health, Bethesda, MD, United States
- Eleni Linos
- School of Medicine, Stanford University, Stanford, CA, United States
- Francisco Pereira
- National Institute of Mental Health, National Institutes of Health, Bethesda, MD, United States
2
Kim K. Scanned information exposure and support for tobacco regulations among US youth and young adult tobacco product users and non-users. Health Education Research 2023; 38:426-444. [PMID: 37565566 PMCID: PMC10516358 DOI: 10.1093/her/cyad033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2022] [Revised: 06/20/2023] [Accepted: 07/14/2023] [Indexed: 08/12/2023]
Abstract
The influences of information exposure on youth and young adults' (YYA) support for smoking/vaping regulations have been understudied. This study examines (i) the relationships between routine exposure to (i.e. scanning) anti-smoking/pro-vaping information and YYA support for anti-smoking/vaping regulations and (ii) whether these relationships differ across YYA users and non-users of tobacco products. We analyzed the data from a nationally representative two-wave rolling cross-sectional survey of YYA in the United States, collected from 2014 to 2017 (baseline n = 10 642; follow-up n = 4001). Less than 5% of the participants ever scanned pro-smoking and anti-vaping information. Scanning anti-smoking information had significant positive relationships with support for all anti-smoking policies cross-sectionally, and this pattern was longitudinally significant in two anti-smoking policy contexts. Scanning pro-vaping information had significant negative associations with support for anti-vaping policies cross-sectionally, but not longitudinally. The lagged positive relationships between scanning anti-smoking information and support for anti-smoking regulations were stronger among YYA smokers than among YYA non-smokers, whereas evidence from adult data suggested the opposite pattern. The findings suggest that scanning information can affect YYA support for tobacco regulations. Future efforts are required to investigate mechanisms underlying the influences of scanned information on YYA support for tobacco regulations.
Affiliation(s)
- Kwanho Kim
- Department of Communication, Cornell University, 494 Mann Library Building, Ithaca, NY 14853, USA
3
Amin S, Jaiswal A, Washington PY, Pokhrel P. Investigating #vapingcessation in Twitter. Research Square 2023:rs.3.rs-2976095. [PMID: 37333241 PMCID: PMC10275054 DOI: 10.21203/rs.3.rs-2976095/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/20/2023]
Abstract
Evidence suggests that an increasing number of e-cigarette users report intentions and attempts to quit vaping. Since exposure to e-cigarette-related content on social media may influence e-cigarette and other tobacco product use, including potentially e-cigarette cessation, we aimed to explore vaping cessation-related posts on Twitter by utilizing a mixed-methods approach. We collected tweets pertaining to vaping cessation for the time period between January 2022 and December 2022 using snscrape. Tweets were scraped for the following hashtags: #vapingcessation, #quitvaping, and #stopJuuling. Data were analysed using Azure Machine Learning and Nvivo 12 software. Sentiment analysis revealed that vaping cessation-related tweets typically embody positive sentiment and are mostly produced in the U.S. and Australia. Our qualitative analysis identified six emerging themes: vaping cessation support, promotion of vaping cessation, barriers and benefits to vaping cessation, personal vaping cessation, and usefulness of peer support for vaping cessation. Our findings imply that improved dissemination of evidence-based vaping cessation strategies to a broad audience through Twitter may promote vaping cessation at the population level.
4
Elkaim LM, Levett JJ, Niazi F, Alvi MA, Shlobin NA, Linzey JR, Robertson F, Bokhari R, Alotaibi NM, Lasry O. Cervical Myelopathy and Social Media: Mixed Methods Analysis. J Med Internet Res 2023; 25:e42097. [PMID: 37213188 DOI: 10.2196/42097] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/22/2022] [Revised: 04/15/2023] [Accepted: 04/17/2023] [Indexed: 05/23/2023]
Abstract
BACKGROUND Degenerative cervical myelopathy (DCM) is a progressive neurologic condition caused by age-related degeneration of the cervical spine. Social media has become a crucial part of many patients' lives; however, little is known about social media use pertaining to DCM. OBJECTIVE This manuscript describes the landscape of social media use and DCM in patients, caretakers, clinicians, and researchers. METHODS A comprehensive search of the entire Twitter application programming interface database from inception to March 2022 was performed to identify all tweets about cervical myelopathy. Data on Twitter users included geographic location, number of followers, and number of tweets. The number of tweet likes, retweets, quotes, and total engagement were collected. Tweets were also categorized based on their underlying themes. Mentions pertaining to past or upcoming surgical procedures were recorded. A natural language processing algorithm was used to assign a polarity score, subjectivity score, and analysis label to each tweet for sentiment analysis. RESULTS Overall, 1859 unique tweets from 1769 accounts met the inclusion criteria. The highest frequency of tweets was seen in 2018 and 2019, and tweets decreased significantly in 2020 and 2021. Most (888/1769, 50.2%) of the tweets' authors were from the United States, United Kingdom, or Canada. Account categorization showed that 668 of 1769 (37.8%) users discussing DCM on Twitter were medical doctors or researchers, 415 of 1769 (23.5%) were patients or caregivers, and 201 of 1769 (11.4%) were news media outlets. The 1859 tweets most often discussed research (n=761, 40.9%), followed by spreading awareness or informing the public on DCM (n=559, 30.1%). Tweets describing personal patient perspectives on living with DCM were seen in 296 (15.9%) posts, with 65 (24%) of these discussing upcoming or past surgical experiences. Few tweets were related to advertising (n=31, 1.7%) or fundraising (n=7, 0.4%). 
A total of 930 (50%) tweets included a link, 260 (14%) included media (ie, photos or videos), and 595 (32%) included a hashtag. Overall, 847 of the 1859 tweets (45.6%) were classified as neutral, 717 (38.6%) as positive, and 295 (15.9%) as negative. CONCLUSIONS When categorized thematically, most tweets were related to research, followed by spreading awareness or informing the public on DCM. Almost 25% (65/296) of tweets describing patients' personal experiences with DCM discussed past or upcoming surgical interventions. Few posts pertained to advertising or fundraising. These data can help identify areas for improvement of public awareness online, particularly regarding education, support, and fundraising.
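The abstract reports that each tweet received a polarity score and an analysis label, then gives the label counts. The sketch below shows one way such a mapping and the reported shares could be reproduced. The threshold of 0 and the function names are hypothetical, since the abstract does not name the algorithm or its cutoffs; only the three counts come from the text.

```python
def label_from_polarity(polarity, threshold=0.0):
    """Map a polarity score in [-1, 1] to a label (threshold is an assumption)."""
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


def share(count, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Label counts as reported in the abstract.
counts = {"neutral": 847, "positive": 717, "negative": 295}
total = sum(counts.values())

for label, count in counts.items():
    print(f"{label}: {count}/{total} ({share(count, total)}%)")
```

The three counts sum to the 1859 tweets in the data set, and the recomputed shares match the percentages given in the abstract.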
Affiliation(s)
- Lior M Elkaim
- Department of Neurology and Neurosurgery, McGill University, Montreal, QC, Canada
- Jordan J Levett
- Department of Medicine, University of Montreal, Montreal, QC, Canada
- Farbod Niazi
- Department of Medicine, University of Montreal, Montreal, QC, Canada
- Mohammed A Alvi
- Division of Neurosurgery, Department of Surgery, University of Toronto, Toronto, ON, Canada
- Nathan A Shlobin
- Department of Neurological Surgery, Feinberg School of Medicine, Northwestern University, Chicago, IL, United States
- Joseph R Linzey
- Department of Neurosurgery, University of Michigan, Detroit, MI, United States
- Faith Robertson
- Department of Neurosurgery, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States
- Rakan Bokhari
- Department of Neurology and Neurosurgery, McGill University, Montreal, QC, Canada
- Naif M Alotaibi
- National Neuroscience Institute, King Fahad Medical City, Riyadh, Saudi Arabia
- Oliver Lasry
- Department of Neurology and Neurosurgery, McGill University, Montreal, QC, Canada
5
Fu R, Kundu A, Mitsakakis N, Elton-Marshall T, Wang W, Hill S, Bondy SJ, Hamilton H, Selby P, Schwartz R, Chaiton MO. Machine learning applications in tobacco research: a scoping review. Tob Control 2023; 32:99-109. [PMID: 34452986 DOI: 10.1136/tobaccocontrol-2020-056438] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 12/19/2020] [Accepted: 04/14/2021] [Indexed: 12/23/2022]
Abstract
OBJECTIVE Identify and review the body of tobacco research literature that self-identified as using machine learning (ML) in the analysis. DATA SOURCES MEDLINE, EMBASE, PubMed, CINAHL Plus, APA PsycINFO and IEEE Xplore databases were searched up to September 2020. Studies were restricted to peer-reviewed, English-language journal articles, dissertations and conference papers comprising an empirical analysis where ML was identified to be the method used to examine human experience of tobacco. Studies of genomics and diagnostic imaging were excluded. STUDY SELECTION Two reviewers independently screened the titles and abstracts. The reference list of articles was also searched. In an iterative process, eligible studies were classified into domains based on their objectives and types of data used in the analysis. DATA EXTRACTION Using data charting forms, two reviewers independently extracted data from all studies. A narrative synthesis method was used to describe findings from each domain such as study design, objective, ML classes/algorithms, knowledge users and the presence of a data sharing statement. Trends of publication were visually depicted. DATA SYNTHESIS 74 studies were grouped into four domains: ML-powered technology to assist smoking cessation (n=22); content analysis of tobacco on social media (n=32); smoker status classification from narrative clinical texts (n=6) and tobacco-related outcome prediction using administrative, survey or clinical trial data (n=14). Implications of these studies and future directions for ML researchers in tobacco control were discussed. CONCLUSIONS ML represents a powerful tool that could advance the research and policy decision-making of tobacco control. Further opportunities should be explored.
Affiliation(s)
- Rui Fu
- Institute of Health Policy Management and Evaluation, University of Toronto, Toronto, Ontario, Canada
- Anasua Kundu
- Ontario Tobacco Research Unit, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
- Nicholas Mitsakakis
- Institute of Health Policy Management and Evaluation, University of Toronto, Toronto, Ontario, Canada
- Children's Hospital of Eastern Ontario Research Institute, Ottawa, Ontario, Canada
- Tara Elton-Marshall
- Institute for Mental Health Policy Research, Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Wei Wang
- Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Sean Hill
- Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Susan J Bondy
- Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Hayley Hamilton
- Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Peter Selby
- Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Robert Schwartz
- Ontario Tobacco Research Unit, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
- Institute for Mental Health Policy Research, Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Michael Oliver Chaiton
- Ontario Tobacco Research Unit, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
- Institute for Mental Health Policy Research, Centre for Addiction and Mental Health, Toronto, Ontario, Canada
6
Hassan L, Elkaref M, de Mel G, Bogdanovica I, Nenadic G. Text mining tweets on e-cigarette risks and benefits using machine learning following a vaping related lung injury outbreak in the USA. Healthcare Analytics (New York, N.Y.) 2022; 2:100066. [PMID: 36605918 PMCID: PMC9801957 DOI: 10.1016/j.health.2022.100066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2022] [Revised: 04/28/2022] [Accepted: 05/12/2022] [Indexed: 01/07/2023]
Abstract
Electronic nicotine delivery systems (ENDS) (also known as 'e-cigarettes') can support smoking cessation, although the long-term health impacts are not yet known. In 2019, a cluster of lung injury cases in the USA emerged that were ostensibly associated with ENDS use. Subsequent investigations revealed a link with vitamin E acetate, an additive used in some ENDS liquid products containing tetrahydrocannabinol (THC). This became known as the EVALI (E-cigarette or Vaping product use Associated Lung Injury) outbreak. While few cases were reported in the UK, the EVALI outbreak intensified attention on ENDS in general worldwide. We aimed to describe and explore public commentary and discussion on Twitter immediately before, during and following the peak of the EVALI outbreak using text mining techniques. Specifically, topic modelling, operationalised using Latent Dirichlet Allocation (LDA) models, was used to discern discussion topics in 189,658 tweets about ENDS (collected April-December 2019). Individual tweets and Twitter users were assigned to their dominant topics and countries respectively to enable international comparisons. A 10-topic LDA model fit the data best. We organised the ten topics into three broad themes for the purposes of reporting: informal vaping discussion; vaping policy discussion and EVALI news; and vaping commerce. Following EVALI, there were signs that informal vaping discussion topics decreased while discussion topics about vaping policy and the relative health risks and benefits of ENDS increased, not limited to THC products. Though subsequently attributed to THC products, the EVALI outbreak disrupted online public discourses about ENDS generally, amplifying health and policy commentary. There was a relatively stronger presence of commercially oriented tweets among UK Twitter users compared to USA users.
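The abstract states that each tweet was assigned to its dominant topic under the fitted LDA model. A minimal sketch of that assignment step follows, assuming the model has already produced one topic-weight vector per tweet. The three-topic toy weights and the function name are hypothetical (the study used a 10-topic model), and fitting the LDA model itself is out of scope here.

```python
from collections import Counter


def dominant_topic(topic_weights):
    """Index of the highest-weight topic; ties go to the lowest index."""
    return max(range(len(topic_weights)), key=lambda k: topic_weights[k])


# Hypothetical per-tweet topic distributions from an already fitted model.
doc_topics = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
]

assignments = [dominant_topic(weights) for weights in doc_topics]
topic_counts = Counter(assignments)  # number of tweets per dominant topic
print(assignments, dict(topic_counts))
```

Counting dominant topics per time window or per country, as the abstract describes, would only require grouping the assignments by the tweet's date or the user's location before calling `Counter`.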
Affiliation(s)
- Lamiece Hassan
- Division of Informatics, Imaging and Data Sciences, The University of Manchester, UK
- Goran Nenadic
- School of Computer Science, The University of Manchester, UK
7
Rahim AIA, Ibrahim MI, Chua SL, Musa KI. Hospital Facebook Reviews Analysis Using a Machine Learning Sentiment Analyzer and Quality Classifier. Healthcare (Basel) 2021; 9:1679. [PMID: 34946405 PMCID: PMC8701188 DOI: 10.3390/healthcare9121679] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/03/2021] [Revised: 11/30/2021] [Accepted: 12/02/2021] [Indexed: 02/05/2023]
Abstract
While experts have recognised the significance and necessity of social media integration in healthcare, no systematic method has been devised in Malaysia or Southeast Asia to include social media input into the hospital quality improvement process. The goal of this work is to explain how to develop a machine learning system for classifying Facebook reviews of public hospitals in Malaysia by using service quality (SERVQUAL) dimensions and sentiment analysis. We developed a Machine Learning Quality Classifier (MLQC) based on the SERVQUAL model and a Machine Learning Sentiment Analyzer (MLSA) by manually annotating multiple batches of randomly chosen reviews. Logistic regression (LR), naive Bayes (NB), support vector machine (SVM), and other methods were used to train the classifiers. The performance of each classifier was tested using 5-fold cross validation. For topic classification, the average F1-score was between 0.687 and 0.757 for all models. In a 5-fold cross validation of each SERVQUAL dimension and in sentiment analysis, SVM consistently outperformed other methods. The study demonstrates how to use supervised learning to automatically identify SERVQUAL domains and sentiments from patient experiences on a hospital's Facebook page. Malaysian healthcare providers can use this content analysis technology to gather and assess data on patient care and improve hospital quality of care.
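The classifiers in this abstract were evaluated with 5-fold cross validation. The sketch below illustrates only the splitting step, in plain Python: the data are shuffled once, cut into five folds, and each fold serves as the test set exactly once. The function name, the fixed seed, and the sample size of 23 are assumptions for illustration; the study's actual tooling is not stated in the abstract.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists so each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k gives folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical data set of 23 annotated reviews.
for fold_number, (train, test) in enumerate(k_fold_indices(23, k=5), start=1):
    print(fold_number, len(train), len(test))
```

A classifier would be trained on `train` and scored on `test` in each round, and the five F1-scores averaged to give a figure comparable to the averages reported above.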
Affiliation(s)
- Afiq Izzudin A. Rahim
- Department of Community Medicine, School of Medical Science, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Kelantan, Malaysia
- Mohd Ismail Ibrahim
- Department of Community Medicine, School of Medical Science, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Kelantan, Malaysia
- Sook-Ling Chua
- Faculty of Computing and Informatics, Multimedia University, Persiaran Multimedia, Cyberjaya 63100, Selangor, Malaysia
- Kamarul Imran Musa
- Department of Community Medicine, School of Medical Science, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Kelantan, Malaysia
8
Haupt MR, Xu Q, Yang J, Cai M, Mackey TK. Characterizing Vaping Industry Political Influence and Mobilization on Facebook: Social Network Analysis. J Med Internet Res 2021; 23:e28069. [PMID: 34714245 PMCID: PMC8590191 DOI: 10.2196/28069] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/19/2021] [Revised: 05/01/2021] [Accepted: 08/04/2021] [Indexed: 11/13/2022]
Abstract
Background In response to recent policy efforts to regulate tobacco and vaping products, the vaping industry has been aggressive in mobilizing opposition by using a network of manufacturers, trade associations, and tobacco user communities, and by appealing to the general public. One strategy the alternative tobacco industry uses to mobilize political action is coordinating on social media platforms, such as the social networking site Facebook. However, few studies have specifically assessed how platforms such as Facebook are used to influence public sentiment and attitudes towards tobacco control policy. Objective This study used social network analysis to examine how the alternative tobacco industry uses Facebook to mobilize online users to influence tobacco control policy outcomes with a focus on the state of California. Methods Data were collected from local and national alternative tobacco Facebook groups that had affiliations with activities in the state of California. Network ties were constructed based on users’ reactions to posts (eg, “like” and “love”) and comments to characterize political mobilization networks. Results Findings show that alternative tobacco industry employees were more likely to engage within these networks and that these employees were also more likely to be influential members (ie, be more active) in the network. Comparisons between subnetworks show that communication within the local alternative tobacco advocacy group network was less dense and more centralized in contrast to a national advocacy group that had overall higher levels of engagement among members. A timeline analysis found that a higher number of influential posts that disseminated widely across networks occurred during e-cigarette–related legislative events, suggesting strategic online engagement and increased mobilization of online activity for the purposes of influencing policy outcomes. 
Conclusions Results from this study provide important insights into how tobacco industry–related advocacy groups leverage the Facebook platform to mobilize their online constituents in an effort to influence public perceptions and coordinate to defeat tobacco control efforts at the local, state, and federal level. Study results reveal one part of a vast network of socially enabled alternative tobacco industry actors and constituents that use Facebook as a mobilization point to support goals of the alternative tobacco industry.
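The abstract compares subnetworks in terms of how dense and how centralized they are. As a rough illustration of what those two measures capture, the sketch below computes density and Freeman degree centralization for a small undirected network. Treating ties as undirected and unweighted is a simplifying assumption, as are the function names and the toy edge lists; the study built its ties from reactions and comments and may have used different definitions.

```python
def density(n_nodes, edges):
    """Share of possible undirected ties that are present."""
    return 2 * len(edges) / (n_nodes * (n_nodes - 1))


def degree_centralization(n_nodes, edges):
    """Freeman degree centralization: 1 for a star, 0 when all degrees match."""
    degree = [0] * n_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    max_degree = max(degree)
    # Normalised by the maximum possible sum, reached by a star (needs n >= 3).
    return sum(max_degree - d for d in degree) / ((n_nodes - 1) * (n_nodes - 2))


# Hypothetical toy networks with four members each.
star = [(0, 1), (0, 2), (0, 3)]  # one hub tied to everyone else
clique = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]  # everyone tied

print(density(4, star), degree_centralization(4, star))
print(density(4, clique), degree_centralization(4, clique))
```

The star is sparse but fully centralized, while the clique is fully dense with no centralization, which mirrors the contrast the abstract draws between the local and national groups.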
Affiliation(s)
- Michael Robert Haupt
- Department of Cognitive Science, University of California, San Diego, La Jolla, CA, United States; Global Health Policy and Data Institute, San Diego, CA, United States
- Qing Xu
- Global Health Policy and Data Institute, San Diego, CA, United States; Department of Healthcare Research and Policy, University of California, San Diego Extension, La Jolla, CA, United States; S-3 Research, San Diego, CA, United States
- Joshua Yang
- Department of Public Health, California State University, Fullerton, CA, United States
- Mingxiang Cai
- Global Health Policy and Data Institute, San Diego, CA, United States; Department of Healthcare Research and Policy, University of California, San Diego Extension, La Jolla, CA, United States; S-3 Research, San Diego, CA, United States
- Tim K Mackey
- Global Health Policy and Data Institute, San Diego, CA, United States; S-3 Research, San Diego, CA, United States; Global Health Program, Department of Anthropology, University of California, San Diego, La Jolla, CA, United States
9
Xu Q, Yang J, Haupt MR, Cai M, Nali MC, Mackey TK. Digital Surveillance to Identify California Alternative and Emerging Tobacco Industry Policy Influence and Mobilization on Facebook. International Journal of Environmental Research and Public Health 2021; 18:11150. [PMID: 34769666 PMCID: PMC8583030 DOI: 10.3390/ijerph182111150] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/13/2021] [Revised: 10/18/2021] [Accepted: 10/19/2021] [Indexed: 11/16/2022]
Abstract
Growing popularity of electronic nicotine-delivery systems (ENDS) has coincided with a need to strengthen tobacco-control policy. In response, the ENDS industry has taken actions to mobilize against public health measures, including coordination on social media platforms. To explore this phenomenon, data mining was used to collect public posts on two Facebook public group pages: the California Consumer Advocates for Smoke Free Alternatives Association (CCASAA) and the community page of the Northern California Chapter of SFATA (NC-SFATA). Posts were manually annotated to characterize themes associated with industry political interference and user interaction. We collected 288 posts from the NC-SFATA and 411 posts from CCASAA. A total of 522 (74.7%) posts were categorized as a form of political interference, with 339 posts (64.9%) from CCASAA and 183 posts (35.1%) from NC-SFATA. We identified four different categories of policy interference-related posts: (1) providing updates on ENDS-related policy at the federal, state, and local levels; (2) sharing opinions about ENDS-related policies; (3) posts related to scientific information related to vaping; and (4) calls to action to mobilize against tobacco/ENDS policies. Our findings indicate that pro-tobacco social media communities on Facebook, driven by strategic activities of trade associations and their members, may act as focal points for anti-policy information dissemination, grass-roots mobilization, and industry coordination that needs further research.
Affiliation(s)
- Qing Xu
- Global Health Policy and Data Institute, San Diego, CA 92121, USA
- S-3 Research, LLC, San Diego, CA 92121, USA
- Joshua Yang
- Department of Public Health, California State University, Fullerton, CA 92834, USA
- Michael R. Haupt
- Global Health Policy and Data Institute, San Diego, CA 92121, USA
- Department of Cognitive Science, University of California, San Diego, CA 92093, USA
- Mingxiang Cai
- Global Health Policy and Data Institute, San Diego, CA 92121, USA
- S-3 Research, LLC, San Diego, CA 92121, USA
- Global Health Program, Department of Anthropology, University of California, San Diego, CA 92093, USA
- Matthew C. Nali
- Global Health Policy and Data Institute, San Diego, CA 92121, USA
- S-3 Research, LLC, San Diego, CA 92121, USA
- Global Health Program, Department of Anthropology, University of California, San Diego, CA 92093, USA
- Tim K. Mackey
- Global Health Policy and Data Institute, San Diego, CA 92121, USA
- S-3 Research, LLC, San Diego, CA 92121, USA
- Global Health Program, Department of Anthropology, University of California, San Diego, CA 92093, USA
10
Rahim AIA, Ibrahim MI, Musa KI, Chua SL, Yaacob NM. Patient Satisfaction and Hospital Quality of Care Evaluation in Malaysia Using SERVQUAL and Facebook. Healthcare (Basel) 2021; 9:1369. [PMID: 34683050 PMCID: PMC8544585 DOI: 10.3390/healthcare9101369] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/22/2021] [Revised: 09/27/2021] [Accepted: 10/12/2021] [Indexed: 02/05/2023]
Abstract
Patient online reviews (POR) posted on social media sites have been proposed as a new method for assessing patient satisfaction and monitoring quality of care. However, the unstructured nature of POR data derived from social media creates a number of challenges. The objectives of this research were to identify service quality (SERVQUAL) dimensions automatically from hospital Facebook reviews using a machine learning classifier, and to examine their associations with patient dissatisfaction. From January 2017 to December 2019, empirical research was conducted in which POR were gathered from the official Facebook pages of Malaysian public hospitals. To find SERVQUAL dimensions in POR, a machine learning topic classifier based on supervised learning was developed, and the associations with patient dissatisfaction were examined using logistic regression analysis. It was discovered that 73.5% of patients were satisfied with the public hospital service, whereas 26.5% were dissatisfied. The SERVQUAL dimensions identified were tangible in 13.2% of reviews, reliability in 68.9%, responsiveness in 6.8%, assurance in 19.5%, and empathy in 64.3%. After controlling for hospital variables, all SERVQUAL dimensions except tangible and assurance were shown to be significantly associated with patient dissatisfaction (reliability, p < 0.001; responsiveness, p = 0.016; and empathy, p < 0.001). Rural hospitals had a higher probability of patient dissatisfaction (p < 0.001). Therefore, POR, assisted by machine learning technologies, provided a pragmatic and feasible way of capturing patient perceptions of care quality and supplementing conventional patient satisfaction surveys. The findings offer critical information that will assist healthcare authorities in capitalising on POR by monitoring and evaluating the quality of services in real time.
Affiliation(s)
- Afiq Izzudin A. Rahim
- Department of Community Medicine, School of Medical Science, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Kelantan, Malaysia
- Mohd Ismail Ibrahim
- Department of Community Medicine, School of Medical Science, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Kelantan, Malaysia
- Kamarul Imran Musa
- Department of Community Medicine, School of Medical Science, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Kelantan, Malaysia
- Sook-Ling Chua
- Faculty of Computing and Informatics, Multimedia University, Persiaran Multimedia, Cyberjaya 63100, Selangor, Malaysia
- Najib Majdi Yaacob
- Unit of Biostatistics and Research Methodology, Health Campus, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Kelantan, Malaysia
11
|
A. Rahim AI, Ibrahim MI, Musa KI, Chua SL, Yaacob NM. Assessing Patient-Perceived Hospital Service Quality and Sentiment in Malaysian Public Hospitals Using Machine Learning and Facebook Reviews. International Journal of Environmental Research and Public Health 2021; 18:9912. [PMID: 34574835 PMCID: PMC8466628 DOI: 10.3390/ijerph18189912] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/02/2021] [Revised: 09/17/2021] [Accepted: 09/18/2021] [Indexed: 02/05/2023]
Abstract
Social media is emerging as a new avenue for hospitals and patients to solicit input on the quality of care. However, social media data is unstructured and enormous in volume. Moreover, no empirical research on the use of social media data and perceived hospital quality of care based on patient online reviews has been performed in Malaysia. The purpose of this study was to investigate the determinants of positive sentiment expressed in hospital Facebook reviews in Malaysia, as well as the association between hospital accreditation and sentiments expressed in Facebook reviews. From 2017 to 2019, we retrieved comments from the official Facebook pages of 48 public hospitals. We used machine learning to build a sentiment analyzer and a service quality (SERVQUAL) classifier that automatically classify the sentiment and SERVQUAL dimensions of each review. We used logistic regression analysis to address these objectives. We evaluated a total of 1852 reviews, and our machine learning sentiment analyzer classified 72.1% of reviews as positive and 27.9% as negative. We classified 240 reviews as tangible, 1257 reviews as trustworthy, 125 reviews as responsive, 356 reviews as assurance, and 1174 reviews as empathy using our machine learning SERVQUAL classifier. After adjusting for hospital characteristics, all SERVQUAL dimensions except Tangible were associated with positive sentiment. However, no significant relationship between hospital accreditation and online sentiment was discovered. Facebook reviews powered by machine learning algorithms provide valuable, real-time data that may be missed by traditional hospital quality assessments. Additionally, online patient reviews offer a hitherto untapped indication of quality that may benefit all healthcare stakeholders. Our results confirm prior studies and support the use of Facebook reviews as an adjunct method for assessing the quality of hospital services in Malaysia.
Affiliation(s)
- Afiq Izzudin A. Rahim
  - Department of Community Medicine, School of Medical Science, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Kelantan, Malaysia
- Mohd Ismail Ibrahim
  - Department of Community Medicine, School of Medical Science, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Kelantan, Malaysia
- Kamarul Imran Musa
  - Department of Community Medicine, School of Medical Science, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Kelantan, Malaysia
- Sook-Ling Chua
  - Faculty of Computing and Informatics, Multimedia University, Persiaran Multimedia, Cyberjaya 63100, Selangor, Malaysia
- Najib Majdi Yaacob
  - Units of Biostatistics and Research Methodology, School of Medical Sciences, Health Campus, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Kelantan, Malaysia
12
Baker W, Colditz JB, Dobbs PD, Mai H, Visweswaran S, Zhan J, Primack BA. Classification of Twitter Vaping Discourse Using BERTweet: Comparative Deep Learning Study (Preprint). JMIR Med Inform 2021; 10:e33678. [PMID: 35862172 PMCID: PMC9353682 DOI: 10.2196/33678] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2021] [Revised: 03/21/2022] [Accepted: 05/08/2022] [Indexed: 11/13/2022]
Affiliation(s)
- William Baker
  - Department of Computer Science and Computer Engineering, University of Arkansas, Fayetteville, AR, United States
- Jason B Colditz
  - Division of General Internal Medicine, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States
- Page D Dobbs
  - Health, Human Performance and Recreation Department, University of Arkansas, Fayetteville, AR, United States
- Huy Mai
  - Department of Computer Science and Computer Engineering, University of Arkansas, Fayetteville, AR, United States
- Shyam Visweswaran
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, United States
- Justin Zhan
  - Department of Computer Science and Computer Engineering, University of Arkansas, Fayetteville, AR, United States
- Brian A Primack
  - College of Public Health and Human Science, Oregon State University, Corvallis, OR, United States
13
Jongenelis MI, Jongenelis G, Alexander E, Kennington K, Phillips F, Pettigrew S. A content analysis of the tweets of e-cigarette proponents in Australia. Health Promot J Austr 2021; 33:445-450. [PMID: 34143553 DOI: 10.1002/hpja.510] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2021] [Accepted: 06/16/2021] [Indexed: 11/09/2022]
Abstract
ISSUE ADDRESSED Social media sites have become platforms for public discourse on e-cigarettes, providing proponents with an opportunity to disseminate favourable information about the devices. Research examining the information being presented by Australian proponents of e-cigarettes is limited. Accordingly, this study explored the Twitter feeds of Australian proponents of e-cigarettes to determine the nature of the e-cigarette-related content being disseminated. METHODS All publicly available e-cigarette-related tweets and retweets (n = 1397) disseminated over a 15-week period by five Australian e-cigarette proponents were captured and analysed. RESULTS The main topics covered in the 1397 tweets analysed related to (a) criticism of the arguments made by public health agencies/advocates who oppose e-cigarettes (29%), (b) Australian e-cigarette policy (19%), (c) the health risks of e-cigarettes (16%) and (d) the efficacy of e-cigarettes as smoking cessation aids (13%). Proponents argued that the precautionary principle adopted by public health agencies/advocates lacks an appropriate evidence base and that legalising e-cigarettes would reduce smoking rates and smoking-related harm. Proponents minimised the risks associated with e-cigarette use and only presented evidence indicating that use facilitates smoking cessation. CONCLUSIONS The assessed tweets have the potential to reduce the public's trust in the information being presented by authoritative public health agencies/advocates. The dissemination of information downplaying the health risks associated with e-cigarettes may distort perceptions of the devices. SO WHAT?: To assist tobacco control efforts, results highlight the need for (a) ongoing surveillance of the tweets of e-cigarette proponents and (b) provision of evidence-based counterarguments on social media.
Affiliation(s)
- Michelle I Jongenelis
  - Melbourne Centre for Behaviour Change, Melbourne School of Psychological Sciences, University of Melbourne, Parkville, VIC, Australia
- Simone Pettigrew
  - The George Institute for Global Health, University of New South Wales, Newtown, Australia
14
Analyzing Twitter Data to Evaluate People's Attitudes towards Public Health Policies and Events in the Era of COVID-19. International Journal of Environmental Research and Public Health 2021; 18:6272. [PMID: 34200576 PMCID: PMC8296042 DOI: 10.3390/ijerph18126272] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/31/2021] [Revised: 06/07/2021] [Accepted: 06/08/2021] [Indexed: 11/17/2022]
Abstract
Policymakers and relevant public health authorities can analyze people’s attitudes towards public health policies and events using sentiment analysis. Sentiment analysis focuses on classifying and analyzing text sentiments. A Twitter sentiment analysis has the potential to monitor people’s attitudes towards public health policies and events. Here, we explore the feasibility of using Twitter data to build a surveillance system for monitoring people’s attitudes towards public health policies and events since the beginning of the COVID-19 pandemic. In this study, we conducted a sentiment analysis of Twitter data. We analyzed the relationship between the sentiment changes in COVID-19-related tweets and public health policies and events. Furthermore, to improve the performance of the early trained model, we developed a data preprocessing approach by using the pre-trained model and early Twitter data, which were available at the beginning of the pandemic. Our study identified a strong correlation between the sentiment changes in COVID-19-related Twitter data and public health policies and events. Additionally, the experimental results suggested that the data preprocessing approach improved the performance of the early trained model. This study verified the feasibility of developing a fast and low-human-effort surveillance system for monitoring people’s attitudes towards public health policies and events during a pandemic by analyzing Twitter data. Based on the pre-trained model and early Twitter data, we can quickly build a model for the surveillance system.
15
Collecting a Large Scale Dataset for Classifying Fake News Tweets Using Weak Supervision. Future Internet 2021. [DOI: 10.3390/fi13050114] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/16/2022]
Abstract
The problem of automatic detection of fake news in social media, e.g., on Twitter, has recently drawn some attention. Although, from a technical perspective, it can be regarded as a straight-forward, binary classification problem, the major challenge is the collection of large enough training corpora, since manual annotation of tweets as fake or non-fake news is an expensive and tedious endeavor, and recent approaches utilizing distributional semantics require large training corpora. In this paper, we introduce an alternative approach for creating a large-scale dataset for tweet classification with minimal user intervention. The approach relies on weak supervision and automatically collects a large-scale, but very noisy, training dataset comprising hundreds of thousands of tweets. As a weak supervision signal, we label tweets by their source, i.e., trustworthy or untrustworthy source, and train a classifier on this dataset. We then use that classifier for a different classification target, i.e., the classification of fake and non-fake tweets. Although the labels are not accurate according to the new classification target (not all tweets by an untrustworthy source need to be fake news, and vice versa), we show that despite this unclean, inaccurate dataset, the results are comparable to those achieved using a manually labeled set of tweets. Moreover, we show that the combination of the large-scale noisy dataset with a human labeled one yields more advantageous results than either of the two alone.
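The weak-supervision idea summarised in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the account names, the whitespace tokeniser, and the choice of a multinomial naive Bayes classifier are all assumptions made purely for illustration. Tweets are labelled by the trustworthiness of their source, a classifier is trained on those noisy labels, and its prediction is then reused as a proxy for fake versus non-fake content.

```python
import math
from collections import Counter, defaultdict

# Hypothetical source lists; a real study would curate these by hand.
TRUSTED = {"reliable_news"}
UNTRUSTED = {"clickbait_daily"}


def weak_label(source):
    """Return a noisy label derived only from the tweet's source account."""
    if source in TRUSTED:
        return "trustworthy"
    if source in UNTRUSTED:
        return "untrustworthy"
    return None  # unknown sources contribute no training signal


def weakly_labelled(tweets):
    """tweets: iterable of (source, text); yields (text, label) pairs."""
    for source, text in tweets:
        label = weak_label(source)
        if label is not None:
            yield text, label


def train_nb(examples):
    """Collect counts for a multinomial naive Bayes model (illustrative choice)."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def predict(model, text):
    """Return the most probable label, using add-one smoothing."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in text.lower().split():
            if tok in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best, best_score = label, score
    return best
```

Calling `train_nb(weakly_labelled(tweets))` on a list of `(source, text)` pairs yields a model whose `predict` output can then be compared against a small hand-labelled set, which is the kind of comparison the abstract reports.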
16
Lee SJ, Liu J, Gibson LA, Hornik RC. Rating the Valence of Media Content about Electronic Cigarettes Using Crowdsourcing: Testing Rater Instructions and Estimating the Optimal Number of Raters. Health Communication 2021; 36:497-507. [PMID: 31830827 PMCID: PMC7292742 DOI: 10.1080/10410236.2019.1700882] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/10/2023]
Abstract
Electronic cigarettes (e-cigarettes) are a controversial public health topic due to their increasing popularity among youth and the uncertainty about their risks and benefits. Researchers have started to assess the valence of media content about e-cigarette use, mostly using expert coding. The current study aims to offer a methodological framework and guideline when using crowdsourcing to rate the valence of e-cigarette media content. Specifically, we present (1) an experiment to determine rating instructions that would result in reliable valence ratings and (2) an analysis to identify the optimal number of raters needed to replicate these ratings. Specifically, we compared ratings produced by crowdsourced raters instructed to rate from several different perspectives (e.g., objective vs. subjective) and determined the instructions that led to reliable ratings. We then used bootstrapping methods and a set of criteria to identify the minimum number of raters needed to replicate these ratings. Results suggested that when rating e-cigarette valence, instructing raters to rate from their own subjective perspective produced reliable results, and nine raters were deemed the optimal number of raters. We expect these findings to inform future content analyses of e-cigarette valence. The study procedures can be applied to crowdsourced content analyses of other health-related media content to determine appropriate rating instructions and the number of raters.
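One way to picture the bootstrapping step described in this abstract is the following Python sketch. The abstract does not state the criteria the authors used, so the agreement measure (Pearson correlation between a resampled sub-panel's mean ratings and the full panel's mean ratings), the threshold of 0.9, and all function names are assumptions for illustration only.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def bootstrap_agreement(ratings, k, n_boot=500, seed=0):
    """Average agreement between k resampled raters and the full panel.

    ratings: list of items, each a list with one score per rater.
    """
    rng = random.Random(seed)
    n_raters = len(ratings[0])
    full_means = [mean(item) for item in ratings]
    corrs = []
    for _ in range(n_boot):
        # Draw k raters with replacement and average their scores per item.
        picked = [rng.randrange(n_raters) for _ in range(k)]
        sub_means = [mean(item[j] for j in picked) for item in ratings]
        corrs.append(pearson(sub_means, full_means))
    return mean(corrs)


def min_raters(ratings, threshold=0.9, **kwargs):
    """Smallest k whose bootstrap agreement reaches the (assumed) threshold."""
    for k in range(1, len(ratings[0]) + 1):
        if bootstrap_agreement(ratings, k, **kwargs) >= threshold:
            return k
    return None
```

With a real ratings matrix, `min_raters(ratings)` gives a rough sense of how many crowdsourced raters are needed before adding more stops changing the averaged valence scores much; the answer depends entirely on the threshold chosen.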
Affiliation(s)
- Stella Juhyun Lee
  - Harvard University, TH Chan School of Public Health
  - Dana-Farber Cancer Institute, Population Sciences Division, Center for Community-Based Research
- Jiaying Liu
  - University of Georgia, Department of Communication Studies
- Laura A. Gibson
  - University of Pennsylvania, Perelman School of Medicine, Department of Medical Ethics and Health Policy
17
Kim K, Gibson LA, Williams S, Kim Y, Binns S, Emery SL, Hornik RC. Valence of Media Coverage About Electronic Cigarettes and Other Tobacco Products From 2014 to 2017: Evidence From Automated Content Analysis. Nicotine Tob Res 2021; 22:1891-1900. [PMID: 32428214 DOI: 10.1093/ntr/ntaa090] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 09/04/2019] [Accepted: 05/13/2020] [Indexed: 12/20/2022]
Abstract
INTRODUCTION As media exposure can influence people's opinions and perceptions about vaping and smoking, analyzing the valence of media content about tobacco products (ie, overall attitude toward tobacco, cigars, electronic cigarettes, etc.) is an important issue. This study advances the field by analyzing a large amount of media content about multiple tobacco products across six different media sources. AIMS AND METHODS From May 2014 to December 2017, we collected all English-language media items about tobacco products that U.S. young people might see from mass media and websites (long-form) and social media (Twitter and YouTube). We used supervised machine learning to develop validated algorithms to label the valence of these media items. Using the labeled results, we examined the impact of product type (e-cigarettes vs. other tobacco products), source (long-form vs. social media), and time (by month) on the valence of coverage. RESULTS We obtained 152 886 long-form media texts (20% with more than a passing mention), nearly 86 million tweets, and 12 262 YouTube videos about tobacco products. Most long-form media content opposed, while most social media coverage supported, the use of e-cigarettes and other tobacco products. Over time, within-source valence proportions were stable, though in aggregate, the amount of media coverage against the use of tobacco products decreased. CONCLUSIONS This study describes the U.S. public communication environment about vaping and smoking for young people and offers a novel big data approach to analyzing media content. Results suggest that content has gradually become less negative toward the use of e-cigarettes and other tobacco products. IMPLICATIONS This study is the first to examine how the valence of media coverage differs for e-cigarettes versus other tobacco products, across several media sources, and over time using a large corpus of media items. 
Unlike prior studies, these data allow us to draw conclusions about relative support and opposition for these two categories of products in a variety of media coverage because the same coding scheme was used across products and media sources.
Affiliation(s)
- Kwanho Kim
  - Annenberg School for Communication, University of Pennsylvania, Philadelphia, PA
- Laura A Gibson
  - Annenberg School for Communication, University of Pennsylvania, Philadelphia, PA
- Sharon Williams
  - Annenberg School for Communication, University of Pennsylvania, Philadelphia, PA
- Robert C Hornik
  - Annenberg School for Communication, University of Pennsylvania, Philadelphia, PA
18
Singh T, Roberts K, Cohen T, Cobb N, Wang J, Fujimoto K, Myneni S. Social Media as a Research Tool (SMaaRT) for Risky Behavior Analytics: Methodological Review. JMIR Public Health Surveill 2020; 6:e21660. [PMID: 33252345 PMCID: PMC7735906 DOI: 10.2196/21660] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 06/20/2020] [Revised: 10/05/2020] [Accepted: 11/06/2020] [Indexed: 12/11/2022]
Abstract
BACKGROUND Modifiable risky health behaviors, such as tobacco use, excessive alcohol use, being overweight, lack of physical activity, and unhealthy eating habits, are some of the major factors for developing chronic health conditions. Social media platforms have become indispensable means of communication in the digital era. They provide an opportunity for individuals to express themselves, as well as share their health-related concerns with peers and health care providers, with respect to risky behaviors. Such peer interactions can be utilized as valuable data sources to better understand inter- and intrapersonal psychosocial mediators and the mechanisms of social influence that drive behavior change. OBJECTIVE The objective of this review is to summarize computational and quantitative techniques facilitating the analysis of data generated through peer interactions pertaining to risky health behaviors on social media platforms. METHODS We performed a systematic review of the literature in September 2020 by searching three databases (PubMed, Web of Science, and Scopus) using relevant keywords, such as "social media," "online health communities," "machine learning," "data mining," etc. The reporting of the studies was directed by the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. Two reviewers independently assessed the eligibility of studies based on the inclusion and exclusion criteria. We extracted the required information from the selected studies. RESULTS The initial search returned a total of 1554 studies, and after careful analysis of titles, abstracts, and full texts, a total of 64 studies were included in this review. 
We extracted the following key characteristics from all of the studies: social media platform used for conducting the study, risky health behavior studied, the number of posts analyzed, study focus, key methodological functions and tools used for data analysis, evaluation metrics used, and summary of the key findings. The most commonly used social media platform was Twitter, followed by Facebook, QuitNet, and Reddit. The most commonly studied risky health behavior was nicotine use, followed by drug or substance abuse and alcohol use. Various supervised and unsupervised machine learning approaches were used for analyzing textual data generated from online peer interactions. Few studies utilized deep learning methods for analyzing textual data as well as image or video data. Social network analysis was also performed, as reported in some studies. CONCLUSIONS Our review consolidates the methodological underpinnings for analyzing risky health behaviors and has enhanced our understanding of how social media can be leveraged for nuanced behavioral modeling and representation. The knowledge gained from our review can serve as a foundational component for the development of persuasive health communication and effective behavior modification technologies aimed at the individual and population levels.
Affiliation(s)
- Tavleen Singh
  - School of Biomedical Informatics, The University of Texas Health Science Center, Houston, TX, United States
- Kirk Roberts
  - School of Biomedical Informatics, The University of Texas Health Science Center, Houston, TX, United States
- Trevor Cohen
  - Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, United States
- Nathan Cobb
  - Georgetown University Medical Center, Washington, DC, United States
- Jing Wang
  - School of Nursing, The University of Texas Health Science Center, San Antonio, TX, United States
- Kayo Fujimoto
  - School of Public Health, The University of Texas Health Science Center, Houston, TX, United States
- Sahiti Myneni
  - School of Biomedical Informatics, The University of Texas Health Science Center, Houston, TX, United States
19
Wang D, Lyu JC, Zhao X. Public Opinion About E-Cigarettes on Chinese Social Media: A Combined Study of Text Mining Analysis and Correspondence Analysis. J Med Internet Res 2020; 22:e19804. [PMID: 33052127 PMCID: PMC7593864 DOI: 10.2196/19804] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/01/2020] [Revised: 09/04/2020] [Accepted: 09/07/2020] [Indexed: 12/24/2022]
Abstract
Background Electronic cigarettes (e-cigarettes) have become increasingly popular. China has accelerated its legislation on e-cigarettes in recent years by issuing two policies to regulate their use: the first on August 26, 2018, and the second on November 1, 2019. Social media provide an efficient platform to access information on the public opinion of e-cigarettes. Objective To gain insight into how policies have influenced the reaction of the Chinese public to e-cigarettes, this study aims to understand what the Chinese public say about e-cigarettes and how the focus of discussion might have changed in the context of policy implementation. Methods This study uses a combination of text mining and correspondence analysis to content analyze 1160 e-cigarette–related questions and their corresponding answers from Zhihu, China’s largest question-and-answer platform and one of the country’s most trustworthy social media sources. From January 1, 2017, to December 31, 2019, Python was used to text mine the most frequently used words and phrases in public e-cigarette discussions on Zhihu. The correspondence analysis was used to examine the similarities and differences between high-frequency words and phrases across 3 periods (ie, January 1, 2017, to August 27, 2018; August 28, 2018, to October 31, 2019; and November 1, 2019, to January 1, 2020). Results The results of the study showed that the consistent themes across time were comparisons with traditional cigarettes, health concerns, and how to choose e-cigarette products. The issuance of government policies on e-cigarettes led to a change in the focus of public discussion. The discussion of e-cigarettes in period 1 mainly focused on the use and experience of e-cigarettes. In period 2, the public’s attention was not only on the substances related to e-cigarettes but also on the smoking cessation functions of e-cigarettes. 
In period 3, the public shifted their attention to the e-cigarette industry and government policy on the banning of e-cigarette sales to minors. Conclusions Social media are an informative source, which can help policy makers and public health professionals understand the public’s concerns over and understanding of e-cigarettes. When there was little regulation, public discussion was greatly influenced by industry claims about e-cigarettes; however, once e-cigarette policies were issued, these policies, to a large extent, set the agenda for public discussion. In addition, media reporting of these policies might have greatly influenced the way e-cigarette policies were discussed. Therefore, monitoring e-cigarette discussions on social media and responding to them in a timely manner will both help improve the public’s e-cigarette literacy and facilitate the implementation of e-cigarette–related policies.
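The text-mining step in this abstract (finding high-frequency words in each of the three policy periods) can be sketched in Python as below. The period boundaries are taken from the abstract as stated; everything else is an assumption for illustration, including the function names and the expectation that posts arrive already tokenised (Chinese text would first need a word segmenter, which is not shown).

```python
from collections import Counter
from datetime import date

# Period boundaries as given in the abstract.
PERIODS = [
    ("period 1", date(2017, 1, 1), date(2018, 8, 27)),
    ("period 2", date(2018, 8, 28), date(2019, 10, 31)),
    ("period 3", date(2019, 11, 1), date(2020, 1, 1)),
]


def period_of(day):
    """Return the name of the period containing `day`, or None."""
    for name, start, end in PERIODS:
        if start <= day <= end:
            return name
    return None


def top_terms_by_period(posts, n=3):
    """posts: iterable of (date, list_of_tokens); returns top-n terms per period."""
    counts = {name: Counter() for name, _, _ in PERIODS}
    for day, tokens in posts:
        name = period_of(day)
        if name is not None:  # skip posts outside the study window
            counts[name].update(tokens)
    return {name: c.most_common(n) for name, c in counts.items()}
```

The per-period frequency tables produced this way are the kind of input a correspondence analysis would take, with periods as one dimension and high-frequency terms as the other.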
Affiliation(s)
- Di Wang
  - Faculty of Humanities and Arts, Macau University of Science and Technology, Taipa, Macao
- Joanne Chen Lyu
  - Center for Tobacco Control Research and Education, University of California, San Francisco, CA, United States
- Xiaoyu Zhao
  - Faculty of Humanities and Arts, Macau University of Science and Technology, Taipa, Macao
20
Mutanga MB, Abayomi A. Tweeting on COVID-19 pandemic in South Africa: LDA-based topic modelling approach. African Journal of Science, Technology, Innovation and Development 2020. [DOI: 10.1080/20421338.2020.1817262] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 10/23/2022]
Affiliation(s)
- Murimo Bethel Mutanga
  - Department of Information and Communication Technology, Mangosuthu University of Technology, Durban, South Africa
- Abdultaofeek Abayomi
  - Department of Information and Communication Technology, Mangosuthu University of Technology, Durban, South Africa
21
Visweswaran S, Colditz JB, O'Halloran P, Han NR, Taneja SB, Welling J, Chu KH, Sidani JE, Primack BA. Machine Learning Classifiers for Twitter Surveillance of Vaping: Comparative Machine Learning Study. J Med Internet Res 2020; 22:e17478. [PMID: 32784184 PMCID: PMC7450367 DOI: 10.2196/17478] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 12/16/2019] [Revised: 06/05/2020] [Accepted: 06/11/2020] [Indexed: 01/20/2023]
Abstract
BACKGROUND Twitter presents a valuable and relevant social media platform to study the prevalence of information and sentiment on vaping that may be useful for public health surveillance. Machine learning classifiers that identify vaping-relevant tweets and characterize sentiments in them can underpin a Twitter-based vaping surveillance system. Compared with traditional machine learning classifiers that are reliant on annotations that are expensive to obtain, deep learning classifiers offer the advantage of requiring fewer annotated tweets by leveraging the large numbers of readily available unannotated tweets. OBJECTIVE This study aims to derive and evaluate traditional and deep learning classifiers that can identify tweets relevant to vaping, tweets of a commercial nature, and tweets with provape sentiments. METHODS We continuously collected tweets that matched vaping-related keywords over 2 months from August 2018 to October 2018. From this data set of tweets, a set of 4000 tweets was selected, and each tweet was manually annotated for relevance (vape relevant or not), commercial nature (commercial or not), and sentiment (provape or not). Using the annotated data, we derived traditional classifiers that included logistic regression, random forest, linear support vector machine, and multinomial naive Bayes. In addition, using the annotated data set and a larger unannotated data set of tweets, we derived deep learning classifiers that included a convolutional neural network (CNN), long short-term memory (LSTM) network, LSTM-CNN network, and bidirectional LSTM (BiLSTM) network. The unannotated tweet data were used to derive word vectors that deep learning classifiers can leverage to improve performance. 
RESULTS LSTM-CNN performed the best with the highest area under the receiver operating characteristic curve (AUC) of 0.96 (95% CI 0.93-0.98) for relevance, all deep learning classifiers including LSTM-CNN performed better than the traditional classifiers with an AUC of 0.99 (95% CI 0.98-0.99) for distinguishing commercial from noncommercial tweets, and BiLSTM performed the best with an AUC of 0.83 (95% CI 0.78-0.89) for provape sentiment. Overall, LSTM-CNN performed the best across all 3 classification tasks. CONCLUSIONS We derived and evaluated traditional machine learning and deep learning classifiers to identify vaping-related relevant, commercial, and provape tweets. Overall, deep learning classifiers such as LSTM-CNN had superior performance and had the added advantage of requiring no preprocessing. The performance of these classifiers supports the development of a vaping surveillance system.
Affiliation(s)
- Shyam Visweswaran
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, United States
  - Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, United States
- Jason B Colditz
  - School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Patrick O'Halloran
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, United States
- Na-Rae Han
  - Department of Linguistics, University of Pittsburgh, Pittsburgh, PA, United States
- Sanya B Taneja
  - Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, United States
- Joel Welling
  - Pittsburgh Supercomputing Center, Carnegie Mellon University, Pittsburgh, PA, United States
- Kar-Hai Chu
  - School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Jaime E Sidani
  - School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Brian A Primack
  - College of Education and Health Professions, University of Arkansas, Fayetteville, AR, United States
22
Sanders C, Nahar P, Small N, Hodgson D, Ong BN, Dehghan A, Sharp CA, Dixon WG, Lewis S, Kontopantelis E, Daker-White G, Bower P, Davies L, Kayesh H, Spencer R, McAvoy A, Boaden R, Lovell K, Ainsworth J, Nowakowska M, Shepherd A, Cahoon P, Hopkins R, Allen D, Lewis A, Nenadic G. Digital methods to enhance the usefulness of patient experience data in services for long-term conditions: the DEPEND mixed-methods study. Health Services and Delivery Research 2020. [DOI: 10.3310/hsdr08280] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/22/2022]
Abstract
Background
Collecting NHS patient experience data is critical to ensure the delivery of high-quality services. Data are obtained from multiple sources, including service-specific surveys and widely used generic surveys. There are concerns about the timeliness of feedback, that some groups of patients and carers do not give feedback and that free-text feedback may be useful but is difficult to analyse.
Objective
To understand how to improve the collection and usefulness of patient experience data in services for people with long-term conditions using digital data capture and improved analysis of comments.
Design
The DEPEND study is a mixed-methods study with four parts: qualitative research to explore the perspectives of patients, carers and staff; use of computer science text-analytics methods to analyse comments; co-design of new tools to improve data collection and usefulness; and implementation and process evaluation to assess use of the tools and any impacts.
Setting
Services for people with severe mental illness and musculoskeletal conditions at four sites as exemplars to reflect both mental health and physical long-term conditions: an acute trust (site A), a mental health trust (site B) and two general practices (sites C1 and C2).
Participants
A total of 100 staff members with diverse roles in patient experience management, clinical practice and information technology; 59 patients and 21 carers participated in the qualitative research components.
Interventions
The tools comprised a digital survey completed using a tablet device (kiosk) or a pen and paper/online version; guidance and information for patients, carers and staff; text-mining programs; reporting templates; and a process for eliciting and recording verbal feedback in community mental health services.
Results
We found a lack of understanding and experience of the process of giving feedback. People wanted more meaningful and informal feedback to suit local contexts. Text mining enabled systematic analysis, although challenges remained, and qualitative analysis provided additional insights. All sites managed to collect feedback digitally; however, there was a perceived need for additional resources, and engagement varied. Observation indicated that patients were apprehensive about using kiosks but often would participate with support. The process for collecting and recording verbal feedback in mental health services made sense to participants, but was not successfully adopted, with staff workload and technical problems often highlighted as barriers. Staff thought that new methods were insightful, but observation did not reveal changes in services during the testing period.
Conclusions
The use of digital methods can produce some improvements in the collection and usefulness of feedback. Context and flexibility are important, and digital methods need to be complemented with alternative methods. Text mining can provide useful analysis for reporting on large data sets within large organisations, but qualitative analysis may be more useful for small data sets and in small organisations.
Limitations
New practices need time and support to be adopted and this study had limited resources and a limited testing time.
Future work
Further research is needed to improve text-analysis methods for routine use in services and to evaluate the impact of methods (digital and non-digital) on service improvement in varied contexts and among diverse patients and carers.
Funding
This project was funded by the NIHR Health Services and Delivery Research programme and will be published in full in Health Services and Delivery Research; Vol. 8, No. 28. See the NIHR Journals Library website for further project information.
Affiliation(s)
- Caroline Sanders
- National Institute for Health Research School for Primary Care Research, University of Manchester, Manchester, UK
| | - Papreen Nahar
- National Institute for Health Research School for Primary Care Research, University of Manchester, Manchester, UK
| | - Nicola Small
- National Institute for Health Research School for Primary Care Research, University of Manchester, Manchester, UK
| | - Damian Hodgson
- Alliance Manchester Business School, University of Manchester, Manchester, UK
| | - Bie Nio Ong
- National Institute for Health Research School for Primary Care Research, University of Manchester, Manchester, UK
| | - Azad Dehghan
- Department of Computer Science, University of Manchester, Manchester, UK
| | - Charlotte A Sharp
- National Institute for Health Research School for Primary Care Research, University of Manchester, Manchester, UK
| | - William G Dixon
- Centre for Epidemiology Versus Arthritis, Manchester Academic Health Science Centre, University of Manchester, Manchester, UK
| | - Shôn Lewis
- Division of Psychology and Mental Health, University of Manchester, Manchester, UK
| | - Evangelos Kontopantelis
- National Institute for Health Research School for Primary Care Research, University of Manchester, Manchester, UK
| | - Gavin Daker-White
- National Institute for Health Research School for Primary Care Research, University of Manchester, Manchester, UK
| | - Peter Bower
- National Institute for Health Research School for Primary Care Research, University of Manchester, Manchester, UK
| | - Linda Davies
- Centre for Health Economics, University of Manchester, Manchester, UK
| | - Humayun Kayesh
- Department of Computer Science, University of Manchester, Manchester, UK
| | - Rebecca Spencer
- National Institute for Health Research Collaboration for Leadership in Applied Health Research and Care Greater Manchester, Salford Royal NHS Foundation Trust, Salford, UK
| | - Aneela McAvoy
- National Institute for Health Research School for Primary Care Research, University of Manchester, Manchester, UK
- National Institute for Health Research Collaboration for Leadership in Applied Health Research and Care Greater Manchester, Salford Royal NHS Foundation Trust, Salford, UK
| | - Ruth Boaden
- Alliance Manchester Business School, University of Manchester, Manchester, UK
- National Institute for Health Research Collaboration for Leadership in Applied Health Research and Care Greater Manchester, Salford Royal NHS Foundation Trust, Salford, UK
| | - Karina Lovell
- National Institute for Health Research School for Primary Care Research, University of Manchester, Manchester, UK
- Division of Nursing, Midwifery and Social Work, University of Manchester, Manchester, UK
| | - John Ainsworth
- Centre for Health Informatics, University of Manchester, Manchester, UK
| | - Magdalena Nowakowska
- National Institute for Health Research School for Primary Care Research, University of Manchester, Manchester, UK
| | - Andrew Shepherd
- National Institute for Health Research School for Primary Care Research, University of Manchester, Manchester, UK
| | - Patrick Cahoon
- Greater Manchester Mental Health NHS Foundation Trust, Manchester, UK
| | - Richard Hopkins
- Greater Manchester Mental Health NHS Foundation Trust, Manchester, UK
| | | | | | - Goran Nenadic
- Department of Computer Science, University of Manchester, Manchester, UK
|
23
|
Understanding Discussions of Health Issues on Twitter: A Visual Analytic Study. Online J Public Health Inform 2020; 12:e2. [PMID: 32577151 DOI: 10.5210/ojphi.v12i1.10321] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open
Abstract
Social media allows for the exploration of online discussions of health issues outside of traditional health spaces. Twitter is one of the largest social media platforms that allows users to post short comments (i.e., tweets). The unrestricted access to opinions and a large user base makes Twitter a major source for collection and quick dissemination of some health information. Health organizations, individuals, news organizations, businesses, and a host of other entities discuss health issues on Twitter. However, the enormous number of tweets presents challenges to those who seek to improve their knowledge of health issues. For instance, it is difficult to understand the overall sentiment on a health issue or the central message of the discourse. For Twitter to be an effective tool for health promotion, stakeholders need to be able to understand, analyze, and appraise health information and discussions on this platform. The purpose of this paper is to examine how a visual analytic study can provide insight into a variety of health issues on Twitter. Visual analytics enhances the understanding of data by combining computational models with interactive visualizations. Our study demonstrates how machine learning techniques and visualizations can be used to analyze and understand discussions of health issues on Twitter. In this paper, we report on the process of data collection, analysis of data, and representation of results. We present our findings and discuss the implications of this work to support the use of Twitter for health promotion.
|
24
|
Mavragani A. Infodemiology and Infoveillance: Scoping Review. J Med Internet Res 2020; 22:e16206. [PMID: 32310818 PMCID: PMC7189791 DOI: 10.2196/16206] [Citation(s) in RCA: 118] [Impact Index Per Article: 29.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/10/2019] [Revised: 02/05/2020] [Accepted: 02/08/2020] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND Web-based sources are increasingly employed in the analysis, detection, and forecasting of diseases and epidemics, and in predicting human behavior toward several health topics. This use of the internet has come to be known as infodemiology, a concept introduced by Gunther Eysenbach. Infodemiology and infoveillance studies use web-based data and have become an integral part of health informatics research over the past decade. OBJECTIVE The aim of this paper is to provide a scoping review of the state-of-the-art in infodemiology along with the background and history of the concept, to identify sources and health categories and topics, to elaborate on the validity of the employed methods, and to discuss the gaps identified in current research. METHODS The PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines were followed to extract the publications that fall under the umbrella of infodemiology and infoveillance from the JMIR, PubMed, and Scopus databases. A total of 338 documents were extracted for assessment. RESULTS Of the 338 studies, the vast majority (n=282, 83.4%) were published with JMIR Publications. The Journal of Medical Internet Research features almost half of the publications (n=168, 49.7%), and JMIR Public Health and Surveillance has more than one-fifth of the examined studies (n=74, 21.9%). The interest in the subject has been increasing every year, with 2018 featuring more than one-fourth of the total publications (n=89, 26.3%), and the publications in 2017 and 2018 combined accounted for more than half (n=171, 50.6%) of the total number of publications in the last decade. The most popular source was Twitter with 45.0% (n=152), followed by Google with 24.6% (n=83), websites and platforms with 13.9% (n=47), blogs and forums with 10.1% (n=34), Facebook with 8.9% (n=30), and other search engines with 5.6% (n=19). 
As for the subjects examined, conditions and diseases with 17.2% (n=58) and epidemics and outbreaks with 15.7% (n=53) were the most popular categories identified in this review, followed by health care (n=39, 11.5%), drugs (n=40, 10.4%), and smoking and alcohol (n=29, 8.6%). CONCLUSIONS The field of infodemiology is becoming increasingly popular, employing innovative methods and approaches for health assessment. The use of web-based sources, which provide us with information that would not be accessible otherwise and tackles the issues arising from the time-consuming traditional methods, shows that infodemiology plays an important role in health informatics research.
Affiliation(s)
- Amaryllis Mavragani
- Department of Computing Science and Mathematics, Faculty of Natural Sciences, University of Stirling, Stirling, United Kingdom
|
25
|
Kim Y, Kim JH. Using photos for public health communication: A computational analysis of the Centers for Disease Control and Prevention Instagram photos and public responses. Health Informatics J 2020; 26:2159-2180. [PMID: 31969051 DOI: 10.1177/1460458219896673] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
This study aims to explore the use of Instagram by the Centers for Disease Control and Prevention, one of the representative public health authorities in the United States. For this aim, all of the photos uploaded to the Centers for Disease Control and Prevention Instagram account were crawled and their content was analyzed using Microsoft Azure Cognitive Services. Engagement was measured as the sum of the numbers of likes and comments on each photo, and sentiment analysis of comments was conducted. Results suggest that photos categorized as "text" and "people" took the largest share of the Centers for Disease Control and Prevention Instagram photos. The Centers for Disease Control and Prevention's major way of delivering messages on Instagram was to imprint key messages that call for actions for better health on photos and to provide the source of complementary information in the text component of each post. Photos with more and bigger human faces had lower levels of engagement than the others, and happiness and neutral emotions expressed on the faces in photos were negatively associated with engagement. Features whose high values would make the photos look splendid and gaudy were negatively correlated with engagement, but sharpness was positively correlated.
Affiliation(s)
- Yunhwan Kim
- Hankuk University of Foreign Studies, South Korea
|
26
|
Gibson LA, Siegel L, Kranzler E, Volinsky A, O'Donnell MB, Williams S, Yang Q, Kim Y, Binns S, Tran H, Maidel Epstein V, Leffel T, Jeong M, Liu J, Lee S, Emery S, Hornik RC. Combining Crowd-Sourcing and Automated Content Methods to Improve Estimates of Overall Media Coverage: Theme Mentions in E-cigarette and Other Tobacco Coverage. JOURNAL OF HEALTH COMMUNICATION 2019; 24:889-899. [PMID: 31718524 PMCID: PMC9173594 DOI: 10.1080/10810730.2019.1682724] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/03/2023]
Abstract
Exposure to media content can shape public opinions about tobacco. Accurately describing content is a first step to showing such effects. Historically, content analyses have hand-coded tobacco-focused texts from a few media sources which ignored passing mention coverage and social media sources, and could not reliably capture over-time variation. By using a combination of crowd-sourced and automated coding, we labeled the population of all e-cigarette and other tobacco-related (including cigarettes, hookah, cigars, etc.) 'long-form texts' (focused and passing coverage, in mass media and website articles) and social media items (tweets and YouTube videos) collected May 2014-June 2017 for four tobacco control themes. Automated coding of theme coverage met thresholds for item-level precision and recall, event validation, and weekly-level reliability for most sources, except YouTube. Health, Policy, Addiction and Youth themes were frequent in e-cigarette long-form focused coverage (44%-68%), but not in long-form passing coverage (5%-22%). These themes were less frequent in other tobacco coverage (long-form focused (13-32%) and passing coverage (4-11%)). Themes were infrequent in both e-cigarette (1-3%) and other tobacco tweets (2-4%). Findings demonstrate that passing e-cigarette and other tobacco long-form coverage and social media sources paint different pictures of theme coverage than focused long-form coverage. Automated coding also allowed us to code the amount of data required to estimate reliable weekly theme coverage over three years. E-cigarette theme coverage showed much more week-to-week variation than did other tobacco coverage. Automated coding allows accurate descriptions of theme coverage in passing mentions, social media, and trends in weekly theme coverage.
Affiliation(s)
- Laura A Gibson
- Annenberg School for Communication, University of Pennsylvania, Philadelphia, PA, USA
| | - Leeann Siegel
- Annenberg School for Communication, University of Pennsylvania, Philadelphia, PA, USA
| | - Elissa Kranzler
- Annenberg School for Communication, University of Pennsylvania, Philadelphia, PA, USA
| | - Allyson Volinsky
- Annenberg School for Communication, University of Pennsylvania, Philadelphia, PA, USA
| | - Matthew B O'Donnell
- Annenberg School for Communication, University of Pennsylvania, Philadelphia, PA, USA
| | - Sharon Williams
- Annenberg School for Communication, University of Pennsylvania, Philadelphia, PA, USA
| | - Qinghua Yang
- Annenberg School for Communication, University of Pennsylvania, Philadelphia, PA, USA
| | - Yoonsang Kim
- NORC at the University of Chicago, Chicago, IL, USA
| | - Steven Binns
- NORC at the University of Chicago, Chicago, IL, USA
| | - Hy Tran
- NORC at the University of Chicago, Chicago, IL, USA
| | | | | | - Michelle Jeong
- Annenberg School for Communication, University of Pennsylvania, Philadelphia, PA, USA
| | - Jiaying Liu
- Annenberg School for Communication, University of Pennsylvania, Philadelphia, PA, USA
| | - Stella Lee
- Annenberg School for Communication, University of Pennsylvania, Philadelphia, PA, USA
| | - Sherry Emery
- NORC at the University of Chicago, Chicago, IL, USA
| | - Robert C Hornik
- Annenberg School for Communication, University of Pennsylvania, Philadelphia, PA, USA
|
27
|
Leung R. Increasing the Impact of JMIR Journals in the Attention Economy. J Med Internet Res 2019; 21:e16172. [PMID: 31674916 PMCID: PMC6914247 DOI: 10.2196/16172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/06/2019] [Revised: 10/14/2019] [Accepted: 10/22/2019] [Indexed: 12/04/2022] Open
Abstract
The Journal of Medical Internet Research (JMIR) has attained remarkable achievements in the past twenty years. In depth, JMIR has published the most impactful research in medical informatics and is top ranked in the field. In breadth, JMIR has spun off about thirty sister journals to cover topics such as serious games, mobile health, public health, surveillance, and other medical areas. With ever-increasing data and research findings, academic publishers need to be competitive to win readers' attention. While JMIR is well-positioned in the field, the journal will need more creative strategies to increase its attention base and maintain its leading position. Viable strategies include the creation of online collaborative spaces, the engagement of a more diverse audience through less traditional channels, and partnerships with other publishers and academic institutes. Doing so could also enable JMIR researchers to turn research insights into practical strategies to improve personal health and medical services.
Affiliation(s)
- Ricky Leung
- University at Albany, School of Public Health, Rensselaer, NY, United States
|
28
|
Eysenbach G, Colditz J, Malik M, Yates T, Primack B. Identifying Key Target Audiences for Public Health Campaigns: Leveraging Machine Learning in the Case of Hookah Tobacco Smoking. J Med Internet Res 2019; 21:e12443. [PMID: 31287063 PMCID: PMC6643764 DOI: 10.2196/12443] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2018] [Revised: 03/31/2019] [Accepted: 05/20/2019] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND Hookah tobacco smoking (HTS) is a particularly important issue for public health professionals to address owing to its prevalence and deleterious health effects. Social media sites can be a valuable tool for public health officials to conduct informational health campaigns. Current social media platforms provide researchers with opportunities to better identify and target specific audiences and even individuals. However, we are not aware of systematic research attempting to identify audiences with mixed or ambivalent views toward HTS. OBJECTIVE The objective of this study was to (1) confirm previous research showing positively skewed HTS sentiment on Twitter using a larger dataset by leveraging machine learning techniques and (2) systematically identify individuals who exhibit mixed opinions about HTS via the Twitter platform and therefore represent key audiences for intervention. METHODS We prospectively collected tweets related to HTS from January to June 2016. We double-coded a subset of approximately 5000 randomly sampled tweets for sentiment toward HTS and used these data to train a machine learning classifier to assess the remaining approximately 556,000 HTS-related Twitter posts. Natural language processing software was used to extract linguistic features (ie, language-based covariates). The data were processed by machine learning tools and algorithms using R. Finally, we used the results to identify individuals who, because they had consistently posted both positive and negative content, might be ambivalent toward HTS and represent an ideal audience for intervention. RESULTS There were 561,960 HTS-related tweets: 373,911 were classified as positive and 183,139 were classified as negative. A set of 12,861 users met a priori criteria indicating that they posted both positive and negative tweets about HTS.
CONCLUSIONS Sentiment analysis can allow researchers to identify audience segments on social media that demonstrate ambiguity toward key public health issues, such as HTS, and therefore represent ideal populations for intervention. Using large social media datasets can help public health officials to preemptively identify specific audience segments that would be most receptive to targeted campaigns.
Affiliation(s)
| | - Jason Colditz
- School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States
| | - Momin Malik
- Carnegie Mellon University, Pittsburgh, PA, United States
| | - Tabitha Yates
- School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States
| | - Brian Primack
- School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States
|
29
|
Chu KH, Allem JP, Unger JB, Cruz TB, Akbarpour M, Kirkpatrick MG. Strategies to find audience segments on Twitter for e-cigarette education campaigns. Addict Behav 2019; 91:222-226. [PMID: 30497815 DOI: 10.1016/j.addbeh.2018.11.015] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2018] [Revised: 11/08/2018] [Accepted: 11/14/2018] [Indexed: 12/27/2022]
Abstract
The development of public health education campaigns about tobacco products requires an understanding of specific audience segments including their views, intentions, use of media, perceived barriers, and benefits of change. For example, identifying and targeting individuals who express ambivalence about e-cigarette use on Twitter may be helpful in devising and focusing public health campaigns to reduce e-cigarette use. This study developed a novel analytic strategy using social network analysis to identify audience segments on Twitter based on positive, negative, and neutral e-cigarette sentiment. Using Twitter data collected from April 2015 to March 2016, we identified different sub-groups of users who retweeted about e-cigarettes, and measured each sub-group's clustering coefficient (CC), which describes how tightly people cluster together. Ten high CC and ten low CC groups were randomly selected; then 100 randomly selected tweets from each group were coded for e-cigarette sentiment (positive, negative, neutral). Results indicate that differences in e-cigarette sentiment are associated with clustering of Twitter network ties. Statistical analyses revealed that high CC groups were more likely to have strong e-cigarette sentiments, suggesting that tightly clustered groups may be "echo chambers" (i.e., like-minded people repeating the same messages). By contrast, low CC groups were more likely to have neutral sentiments, and had greater fluctuation in sentiment over time, suggesting that they may be more flexible in their opinions about e-cigarettes and may be particularly receptive to targeted public health campaigns. Informatics techniques such as determination of clusters using social network analysis can be useful in identifying audience segments for future public health campaigns.
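The clustering coefficient mentioned in this abstract can be illustrated with a small sketch. The abstract does not say which definition the authors used, so the code below assumes the common local clustering coefficient (the share of a user's neighbours that are also connected to each other) averaged over a group. The two toy networks and all names are hypothetical and are not data from the study.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of pairs of a node's neighbours that are linked to each other."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        # With fewer than two neighbours there are no pairs to compare.
        return 0.0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean local clustering coefficient over every node in the group."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


# Hypothetical undirected retweet networks, stored as adjacency sets.
# "tight": everyone is connected to everyone else (a triangle).
tight = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
# "loose": a hub whose contacts are not connected to each other (a star).
loose = {"hub": {"x", "y", "z"}, "x": {"hub"}, "y": {"hub"}, "z": {"hub"}}

print(average_clustering(tight))
print(average_clustering(loose))
```

Under this assumed definition, the fully connected group scores at the top of the range and the star-shaped group at the bottom, which matches the abstract's description of the measure as how tightly people cluster together.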
Affiliation(s)
- Kar-Hai Chu
- Center for Research on Media, Technology, and Health, University of Pittsburgh, 230 McKee Place, Suite 600, Pittsburgh, PA 15213, United States
| | - Jon-Patrick Allem
- Department of Preventive Medicine, University of Southern California, 2001 North Soto Street, 3rd Floor, Los Angeles, CA 90032, United States
| | - Jennifer B Unger
- Department of Preventive Medicine, University of Southern California, 2001 North Soto Street, 3rd Floor, Los Angeles, CA 90032, United States
| | - Tess Boley Cruz
- Department of Preventive Medicine, University of Southern California, 2001 North Soto Street, 3rd Floor, Los Angeles, CA 90032, United States
| | - Meleeka Akbarpour
- Department of Preventive Medicine, University of Southern California, 2001 North Soto Street, 3rd Floor, Los Angeles, CA 90032, United States
| | - Matthew G Kirkpatrick
- Department of Preventive Medicine, University of Southern California, 2001 North Soto Street, 3rd Floor, Los Angeles, CA 90032, United States
|
30
|
Waring ME, Baker K, Peluso A, May CN, Pagoto SL. Content analysis of Twitter chatter about indoor tanning. Transl Behav Med 2019; 9:41-47. [PMID: 29474700 DOI: 10.1093/tbm/iby011] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
Twitter may be useful for learning about indoor tanning behavior and attitudes. The objective of this study was to analyze the content of tweets about indoor tanning to determine the extent to which tweets are posted by people who tan, and to characterize the topics of tweets. We extracted 4,691 unique tweets from Twitter using the terms "tanning bed" or "tanning salon" over 7 days in March 2016. We content analyzed a random selection of 1,000 tweets, double-coding 20% of tweets (κ = 0.74, 81% agreement). Most tweets (71%) were by tanners (n = 699 individuals) and included tweets expressing positive sentiment about tanning (57%), and reports of a negative tanning experience (17%), burning (15%), or sleeping in a tanning bed (9%). Four percent of tweets were by tanning salon employees. Tweets posted by people unlikely to be tanners (15%) included tweets mocking tanners (71%) and health warnings (29%). The term "tanning bed" had higher precision for identifying individuals who engage in indoor tanning than "tanning salon"; 77% versus 45% of tweets captured by these search terms were by individuals who engaged in indoor tanning, respectively. Extrapolating to the full data set of 4,691 tweets, findings suggest that an average of 468 individuals who engage in indoor tanning can be identified by their tweets per day. The majority of tweets were from tanners and included reports of especially risky habits (e.g., burning, falling asleep). Twitter provides opportunity to identify indoor tanners and examine conversations about indoor tanning.
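Two figures in this abstract lend themselves to a short worked sketch: the inter-coder statistics (percent agreement and Cohen's κ) and the per-day extrapolation. The coder labels below are made up purely to show how the two agreement measures differ; they are not the study's data. The extrapolation assumes that the share of tanners in the 1,000-tweet sample (699) was scaled to the full 4,691 tweets and divided by the 7 collection days; the abstract does not spell out this calculation, but that reading reproduces the reported figure.

```python
from collections import Counter


def percent_agreement(a, b):
    """Share of items on which two coders assigned the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Agreement between two coders, corrected for chance agreement."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: probability both coders pick the same label at random,
    # given each coder's own label frequencies.
    p_expected = sum(counts_a[l] * counts_b[l] for l in set(a) | set(b)) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical labels from two coders for ten tweets (illustration only).
coder_1 = ["tanner", "tanner", "other", "tanner", "other",
           "other", "tanner", "other", "tanner", "other"]
coder_2 = ["tanner", "tanner", "other", "other", "other",
           "other", "tanner", "tanner", "tanner", "other"]

print(percent_agreement(coder_1, coder_2))
print(cohens_kappa(coder_1, coder_2))

# Extrapolation using the numbers given in the abstract.
tanners_in_sample = 699
sample_size = 1000
total_tweets = 4691
collection_days = 7
tanners_per_day = tanners_in_sample / sample_size * total_tweets / collection_days
print(round(tanners_per_day))
```

κ is lower than raw agreement because it discounts the matches two coders would produce by chance, which is why the abstract reports both values.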
Affiliation(s)
- Molly E Waring
- Department of Allied Health Sciences, University of Connecticut, Storrs, CT; Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, MA
| | - Katie Baker
- Department of Community and Behavioral Health, East Tennessee State University College of Public Health, Johnson City, TN
| | - Anthony Peluso
- Department of Community and Behavioral Health, East Tennessee State University College of Public Health, Johnson City, TN
| | - Christine N May
- Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, MA; Division of Preventive and Behavioral Medicine, Department of Medicine, University of Massachusetts Medical School, Worcester, MA
| | - Sherry L Pagoto
- Department of Allied Health Sciences, University of Connecticut, Storrs, CT; Division of Preventive and Behavioral Medicine, Department of Medicine, University of Massachusetts Medical School, Worcester, MA
|
31
|
Walsh EI, Busby Grant J. Detecting Temporal Cognition in Text: Comparison of Judgements by Self, Expert and Machine. Front Psychol 2018; 9:2037. [PMID: 30416468 PMCID: PMC6212561 DOI: 10.3389/fpsyg.2018.02037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2018] [Accepted: 10/03/2018] [Indexed: 11/20/2022] Open
Abstract
Background: There is a growing research focus on temporal cognition, due to its importance in memory and planning, and links with psychological wellbeing. Researchers are increasingly using diary studies, experience sampling and social media data to study temporal thought. However, it remains unclear whether such reports can be accurately interpreted for temporal orientation. In this study, temporal orientation judgements about text reports of thoughts were compared across human coding, automatic text mining, and participant self-report. Methods: 214 participants responded to randomly timed text message prompts, categorically reporting the temporal direction of their thoughts and describing the content of their thoughts, producing a corpus of 2505 brief (1–358, M = 43 characters) descriptions. Two researchers independently, blindly coded temporal orientation of the descriptions. Four approaches to automated coding used tense to establish temporal category for each description. Concordance between temporal orientation assessments by self-report, human coding, and automatic text mining was evaluated. Results: Human coding more closely matched self-reported coding than automated methods. Accuracy for human (79.93% correct) and automated (57.44% correct) coding was diminished when multiple guesses at ambiguous temporal categories (ties) were allowed in coding (reduction to 74.95% correct for human, 49.05% automated). Conclusion: Ambiguous tense poses a challenge for both human and automated coding protocols that attempt to infer temporal orientation from text describing momentary thought. While methods can be applied to minimize bias, this study demonstrates that researchers need to be wary about attributing temporal orientation to text-reported thought processes, and emphasize the importance of eliciting self-reported judgements.
Affiliation(s)
- Erin I Walsh
- Centre for Research on Ageing, Health & Wellbeing, Australian National University, Canberra, ACT, Australia
| | - Janie Busby Grant
- Centre for Applied Psychology, University of Canberra, Canberra, ACT, Australia
|
32
|
Li Q, Wang C, Liu R, Wang L, Zeng DD, Leischow SJ. Understanding Users' Vaping Experiences from Social Media: Initial Study Using Sentiment Opinion Summarization Techniques. J Med Internet Res 2018; 20:e252. [PMID: 30111530 PMCID: PMC6115599 DOI: 10.2196/jmir.9373] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/10/2017] [Revised: 05/21/2018] [Accepted: 07/10/2018] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND E-liquid is one of the main components in electronic nicotine delivery systems (ENDS). ENDS review comments could serve as an early warning on use patterns and even as an indicator of problems or adverse events pertaining to the use of specific e-liquids, much like the types of responses tracked by the Food and Drug Administration (FDA) regarding medications. OBJECTIVE This study aimed to understand users' "vaping" experience using sentiment opinion summarization techniques, which can help characterize how consumers think about specific e-liquids and their characteristics (eg, flavor, throat hit, and vapor production). METHODS We collected e-liquid reviews on JuiceDB from June 27, 2013 to December 31, 2017 using its public application programming interface. The dataset contains 27,070 reviews for 8058 e-liquid products. Each review is accompanied by an overall rating and a set of 4 aspect ratings of an e-liquid, each on a scale of 1-5: flavor accuracy, throat hit, value, and cloud production. An iterative dichotomiser 3 (ID3)-based influential aspect analysis model was adopted to learn the key elements that impact e-liquid use. Then, fine-grained sentiment analysis was employed to mine opinions on various aspects of vaping experience related to e-liquids. RESULTS We found that flavor accuracy and value were the two most important aspects that affected users' sentiments toward e-liquids. Of reviews in JuiceDB, 67.83% (18,362/27,070) were positive, while 12.67% (3430/27,070) were negative. This indicates that users generally hold positive attitudes toward e-liquids. Among the 9 flavors, fruity and sweet were the two most popular. Great and sweet tastes, reasonable value, and strong throat hit made users satisfied with fruity and sweet flavors, whereas "strange" tastes made users dislike those flavors. Meanwhile, users complained about some e-liquids' steep or expensive prices, bad quality, and harsh throat hit.
There were 2342 fruity e-liquids and 2049 sweet e-liquids. There were 55.81% (1307/2342) and 59.83% (1226/2049) positive sentiments and 13.62% (319/2342) and 12.88% (264/2049) negative sentiments toward fruity e-liquids and sweet e-liquids, respectively. Great flavors and good vapors contributed to positive reviews of fruity and sweet products. However, bad tastes such as "sour" or "bitter" resulted in negative reviews. These findings can help businesses and policy makers to further improve product quality and formulate effective policy. CONCLUSIONS This study provides an effective mechanism for analyzing users' ENDS vaping experience based on sentiment opinion summarization techniques. Sentiment opinions on aspect and products can be found using our method, which is of great importance to monitor e-liquid products and improve work efficiency.
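The ID3 algorithm named in the METHODS ranks attributes by information gain, so a minimal sketch of that ranking step may help readers see what "influential aspect" could mean in practice. The abstract does not describe the authors' implementation; the function names, the binarized ratings, and the four toy reviews below are all hypothetical and are not JuiceDB data.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, aspect, target="sentiment"):
    """Entropy reduction in `target` obtained by splitting `rows` on `aspect`."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[aspect] for row in rows}:
        subset = [row[target] for row in rows if row[aspect] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical toy reviews: aspect ratings collapsed to "high"/"low" for illustration.
reviews = [
    {"flavor_accuracy": "high", "throat_hit": "high", "sentiment": "positive"},
    {"flavor_accuracy": "high", "throat_hit": "low", "sentiment": "positive"},
    {"flavor_accuracy": "low", "throat_hit": "high", "sentiment": "negative"},
    {"flavor_accuracy": "low", "throat_hit": "low", "sentiment": "negative"},
]

gains = {a: information_gain(reviews, a) for a in ("flavor_accuracy", "throat_hit")}
best_aspect = max(gains, key=gains.get)  # ID3 would split on this aspect first
```

In this toy data, flavor accuracy separates the sentiments perfectly while throat hit carries no information, so ID3 would place flavor accuracy at the root of the tree.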
Affiliation(s)
- Qiudan Li
  - The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
- Can Wang
  - The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Ruoran Liu
  - The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Lei Wang
  - The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
- Daniel Dajun Zeng
  - The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
  - Department of Management Information Systems, Eller College of Management, The University of Arizona, Tucson, AZ, United States
- Scott James Leischow
  - College of Health Solutions, Arizona State University, Phoenix, AZ, United States
|
33
|
Martinez LS, Hughes S, Walsh-Buhi ER, Tsou MH. "Okay, We Get It. You Vape": An Analysis of Geocoded Content, Context, and Sentiment regarding E-Cigarettes on Twitter. Journal of Health Communication 2018; 23:550-562. [PMID: 29979920 DOI: 10.1080/10810730.2018.1493057] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 06/08/2023]
Abstract
The current study examined conversations on Twitter related to use and perceptions of e-cigarettes in the United States. We employed the Social Media Analytic and Research Testbed (SMART) dashboard, which was used to identify and download (via a public API) e-cigarette-related geocoded tweets. E-cigarette-related tweets were collected continuously using customized geo-targeted Twitter APIs. A total of 193,051 tweets were collected between October 2015 and February 2016. Of these tweets, a random sample of 973 geocoded tweets were selected and manually coded for information regarding source, context, and message characteristics. Our findings reveal that although over half of tweets were positive, a sizeable portion was negative or neutral. We also found that, among those tweets mentioning a stigma of e-cigarettes, most confirmed that a stigma does exist. Conversely, among tweets mentioning the harmfulness of e-cigarettes, most denied that e-cigarettes were a health hazard. These results suggest that current efforts have left the public with ambiguity regarding the potential dangers of e-cigarettes. Consequently, it is critical to communicate the public health stance on this issue to inform the public and provide counterarguments to the positive sentiments presently dominating conversations about e-cigarettes on social media. The lack of awareness and need to voice a public health position on e-cigarettes represents a vital opportunity to continue winning gains for tobacco control and prevention efforts through health communication interventions targeting e-cigarettes.
Affiliation(s)
- Lourdes S Martinez
  - School of Communication (619-594-8512), San Diego State University, San Diego, CA, USA
- Sharon Hughes
  - Graduate School of Public Health (619-594-6317), San Diego State University, San Diego, CA, USA
- Eric R Walsh-Buhi
  - Graduate School of Public Health (619-594-6317), San Diego State University, San Diego, CA, USA
- Ming-Hsiang Tsou
  - Department of Geography (619-594-0205), San Diego State University, San Diego, CA, USA
|
34
|
Daniulaityte R, Lamy FR, Smith GA, Nahhas RW, Carlson RG, Thirunarayan K, Martins SS, Boyer EW, Sheth A. "Retweet to Pass the Blunt": Analyzing Geographic and Content Features of Cannabis-Related Tweeting Across the United States. J Stud Alcohol Drugs 2018; 78:910-915. [PMID: 29087826 DOI: 10.15288/jsad.2017.78.910] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 11/24/2022]
Abstract
OBJECTIVE Twitter data offer new possibilities for tracking health-related communications. This study is among the first to apply advanced information processing to identify geographic and content features of cannabis-related tweeting in the United States. METHOD Tweets were collected using the streaming application programming interface (March-May 2016) and were processed by eDrugTrends to identify geolocation and classify content by source (personal communication, media, retail) and sentiment (positive, negative, neutral). States were grouped by cannabis legalization policies into "recreational," "medical, less restrictive," "medical, more restrictive," and "illegal." Permutation tests were performed to analyze differences among the four groups in adjusted percentages of all tweets, unique users, personal communications only, and positive-to-negative sentiment ratios. RESULTS About 30% of all 13,233,837 cannabis-related tweets had identifiable state-level geo-information. Among geolocated tweets, 76.2% were personal communications, 21.1% media, and 2.7% retail. About 71% of personal communication tweets expressed positive sentiment toward cannabis; 16% expressed negative sentiment. States in the recreational group had a significantly greater average adjusted percentage of cannabis tweets (3.01%) compared with other groups. For personal communication tweets only, the recreational group (2.47%) was significantly greater than the medical, more restrictive (1.84%) and illegal (1.85%) groups. Similarly, the recreational group had a significantly greater average positive-to-negative sentiment ratio (4.64) compared with the medical, more restrictive (4.15) and illegal (4.19) groups. Average adjusted percentages of unique users showed similar differences between recreational and other groups. CONCLUSIONS States with less restrictive policies displayed greater cannabis-related tweeting and conveyed more positive sentiment. The study demonstrates the potential of Twitter data to become a valuable indicator of drug-related communications in the context of varying policy environments.
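The permutation tests mentioned in the METHOD can be illustrated with a small two-group sketch. The abstract does not state the test statistic or the number of permutations used, so the choice below (absolute difference in group means, with exact enumeration of every relabeling) is an assumption, and the state-level ratios are invented for illustration rather than taken from the study.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the absolute difference in means."""
    pooled = list(group_a) + list(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    # Enumerate every way of relabeling len(group_a) of the pooled values as "group A".
    for indices in combinations(range(len(pooled)), len(group_a)):
        chosen = set(indices)
        relabeled_a = [pooled[i] for i in chosen]
        relabeled_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance so ties with the observed statistic are counted.
        if abs(mean(relabeled_a) - mean(relabeled_b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


# Hypothetical positive-to-negative sentiment ratios for states in two policy groups.
recreational = [4.9, 4.7, 4.6, 4.5]
illegal = [4.1, 4.2, 4.3, 4.0]

p_value = permutation_p_value(recreational, illegal)
```

With four states per group there are 70 possible relabelings; because every invented "recreational" value exceeds every "illegal" value, only the observed split and its mirror image are as extreme, giving a p-value of 2/70. Real analyses with more states would typically sample random relabelings instead of enumerating them all.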
Affiliation(s)
- Raminta Daniulaityte
  - Center for Interventions, Treatment, and Addictions Research (CITAR), Department of Population and Public Health Sciences, Wright State University Boonshoft School of Medicine, Dayton, Ohio
  - Ohio Center of Excellence in Knowledge-Enabled Computing (Kno.e.sis), Department of Computer Science and Engineering, Wright State University, Dayton, Ohio
- Francois R Lamy
  - Center for Interventions, Treatment, and Addictions Research (CITAR), Department of Population and Public Health Sciences, Wright State University Boonshoft School of Medicine, Dayton, Ohio
  - Ohio Center of Excellence in Knowledge-Enabled Computing (Kno.e.sis), Department of Computer Science and Engineering, Wright State University, Dayton, Ohio
- G Alan Smith
  - Ohio Center of Excellence in Knowledge-Enabled Computing (Kno.e.sis), Department of Computer Science and Engineering, Wright State University, Dayton, Ohio
- Ramzi W Nahhas
  - Department of Population and Public Health Sciences, Wright State University Boonshoft School of Medicine, Dayton, Ohio
  - Department of Psychiatry, Wright State University Boonshoft School of Medicine, Dayton, Ohio
- Robert G Carlson
  - Center for Interventions, Treatment, and Addictions Research (CITAR), Department of Population and Public Health Sciences, Wright State University Boonshoft School of Medicine, Dayton, Ohio
  - Ohio Center of Excellence in Knowledge-Enabled Computing (Kno.e.sis), Department of Computer Science and Engineering, Wright State University, Dayton, Ohio
- Krishnaprasad Thirunarayan
  - Ohio Center of Excellence in Knowledge-Enabled Computing (Kno.e.sis), Department of Computer Science and Engineering, Wright State University, Dayton, Ohio
- Silvia S Martins
  - Department of Epidemiology, Columbia University Mailman School of Public Health, New York, New York
- Edward W Boyer
  - Department of Emergency Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Amit Sheth
  - Ohio Center of Excellence in Knowledge-Enabled Computing (Kno.e.sis), Department of Computer Science and Engineering, Wright State University, Dayton, Ohio
|
35
|
Gohil S, Vuik S, Darzi A. Sentiment Analysis of Health Care Tweets: Review of the Methods Used. JMIR Public Health Surveill 2018; 4:e43. [PMID: 29685871 PMCID: PMC5938573 DOI: 10.2196/publichealth.5789] [Citation(s) in RCA: 74] [Impact Index Per Article: 12.3] [Received: 04/18/2016] [Revised: 10/31/2016] [Accepted: 03/14/2017] [Indexed: 01/24/2023]
Abstract
Background Twitter is a microblogging service where users can send and read short 140-character messages called “tweets.” There are several unstructured, free-text tweets relating to health care being shared on Twitter, which is becoming a popular area for health care research. Sentiment is a metric commonly used to investigate the positive or negative opinion within these messages. Exploring the methods used for sentiment analysis in Twitter health care research may allow us to better understand the options available for future research in this growing field. Objective The first objective of this study was to understand which tools would be available for sentiment analysis of Twitter health care research, by reviewing existing studies in this area and the methods they used. The second objective was to determine which method would work best in the health care settings, by analyzing how the methods were used to answer specific health care questions, their production, and how their accuracy was analyzed. Methods A review of the literature was conducted pertaining to Twitter and health care research, which used a quantitative method of sentiment analysis for the free-text messages (tweets). The study compared the types of tools used in each case and examined methods for tool production, tool training, and analysis of accuracy. Results A total of 12 papers studying the quantitative measurement of sentiment in the health care setting were found. More than half of these studies produced tools specifically for their research, 4 used open source tools available freely, and 2 used commercially available software. Moreover, 4 out of the 12 tools were trained using a smaller sample of the study’s final data. The sentiment method was trained against, on average, 0.45% (2816/627,024) of the total sample data. One of the 12 papers commented on the analysis of accuracy of the tool used. Conclusions Multiple methods are used for sentiment analysis of tweets in the health care setting. These range from self-produced basic categorizations to more complex and expensive commercial software. The open source and commercial methods are developed on product reviews and generic social media messages. None of these methods have been extensively tested against a corpus of health care messages to check their accuracy. This study suggests that there is a need for an accurate and tested tool for sentiment analysis of tweets trained using a health care setting–specific corpus of manually annotated tweets first.
Affiliation(s)
- Sunir Gohil
  - Imperial College London, Department of Surgery and Cancer, London, United Kingdom
- Sabine Vuik
  - Imperial College London, Department of Surgery and Cancer, London, United Kingdom
- Ara Darzi
  - Imperial College London, Department of Surgery and Cancer, London, United Kingdom
|
36
|
Mackey TK, Kalyanam J, Katsuki T, Lanckriet G. Twitter-Based Detection of Illegal Online Sale of Prescription Opioid. Am J Public Health 2017; 107:1910-1915. [PMID: 29048960 DOI: 10.2105/ajph.2017.303994] [Citation(s) in RCA: 53] [Impact Index Per Article: 7.6] [Indexed: 11/04/2022]
Abstract
OBJECTIVES To deploy a methodology accurately identifying tweets marketing the illegal online sale of controlled substances. METHODS We first collected tweets from the Twitter public application program interface stream filtered for prescription opioid keywords. We then used unsupervised machine learning (specifically, topic modeling) to identify topics associated with illegal online marketing and sales. Finally, we conducted Web forensic analyses to characterize different types of online vendors. We analyzed 619 937 tweets containing the keywords codeine, Percocet, fentanyl, Vicodin, Oxycontin, oxycodone, and hydrocodone over a 5-month period from June to November 2015. RESULTS A total of 1778 tweets (< 1%) were identified as marketing the sale of controlled substances online; 90% had imbedded hyperlinks, but only 46 were "live" at the time of the evaluation. Seven distinct URLs linked to Web sites marketing or illegally selling controlled substances online. CONCLUSIONS Our methodology can identify illegal online sale of prescription opioids from large volumes of tweets. Our results indicate that controlled substances are trafficked online via different strategies and vendors. Public Health Implications. Our methodology can be used to identify illegal online sellers in criminal violation of the Ryan Haight Online Pharmacy Consumer Protection Act.
Affiliation(s)
- Tim K Mackey
  - Department of Anesthesiology and Department of Medicine, University of California, San Diego
  - Global Health Policy Institute, San Diego
- Janani Kalyanam
  - Global Health Policy Institute, San Diego
  - Department of Electrical and Computer Engineering, University of California, San Diego
- Takeo Katsuki
  - Kavli Institute for Brain and Mind, University of California, San Diego
- Gert Lanckriet
  - Department of Electrical and Computer Engineering, University of California, San Diego
|
37
|
Kagashe I, Yan Z, Suheryani I. Enhancing Seasonal Influenza Surveillance: Topic Analysis of Widely Used Medicinal Drugs Using Twitter Data. J Med Internet Res 2017; 19:e315. [PMID: 28899847 PMCID: PMC5617904 DOI: 10.2196/jmir.7393] [Citation(s) in RCA: 49] [Impact Index Per Article: 7.0] [Received: 01/25/2017] [Revised: 06/09/2017] [Accepted: 07/26/2017] [Indexed: 12/31/2022]
Abstract
BACKGROUND Uptake of medicinal drugs (preventive or treatment) is among the approaches used to control disease outbreaks, and therefore, it is of vital importance to be aware of the counts or frequencies of the most commonly used drugs and trending topics about these drugs from consumers for successful implementation of control measures. Traditional survey methods could have accomplished this study, but they are too costly in terms of resources needed, and they are subject to social desirability bias for topic discovery. Hence, there is a need to use alternative efficient means such as Twitter data and machine learning (ML) techniques. OBJECTIVE Using Twitter data, the aim of the study was to (1) provide a methodological extension for efficiently extracting widely consumed drugs during seasonal influenza and (2) extract topics from the tweets of these drugs and infer how the insights provided by these topics can enhance seasonal influenza surveillance. METHODS From tweets collected during the 2012-13 flu season, we first identified tweets with mentions of drugs and then constructed an ML classifier using dependency words as features. The classifier was used to extract tweets that evidenced consumption of drugs, out of which we identified the most consumed drugs. Finally, we extracted trending topics from each of these widely used drugs' tweets using latent Dirichlet allocation (LDA). RESULTS Our proposed classifier obtained an F1 score of 0.82, which significantly outperformed the two benchmark classifiers (ie, P<.001 with the lexicon-based and P=.048 with the 1-gram term frequency [TF]). The classifier extracted 40,428 tweets that evidenced consumption of drugs out of 50,828 tweets with mentions of drugs. The most widely consumed drugs were influenza virus vaccines that had around 76.95% (31,111/40,428) share of the total; other notable drugs were Theraflu, DayQuil, NyQuil, vitamins, acetaminophen, and oseltamivir. The topics of each of these drugs exhibited common themes or experiences from people who have consumed these drugs. Among these were the enabling and deterrent factors to influenza drugs uptake, which are key to mitigating the severity of seasonal influenza outbreaks. CONCLUSIONS The study results showed the feasibility of using tweets of widely consumed drugs to enhance seasonal influenza surveillance in lieu of the traditional or conventional surveillance approaches. Public health officials and other stakeholders can benefit from the findings of this study, especially in enhancing strategies for mitigating the severity of seasonal influenza outbreaks. The proposed methods can be extended to the outbreaks of other diseases.
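The F1 score reported in the RESULTS is the harmonic mean of precision and recall. As a reminder of how such a figure is obtained for a binary "consumption vs no consumption" classifier, here is a minimal sketch; the function name and the eight toy labels are hypothetical and do not reproduce the study's data or its 0.82 result.

```python
def f1_score(true_labels, predicted_labels, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = tweet evidences drug consumption, 0 = mention only.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0]

score = f1_score(truth, predicted)  # 3 true positives, 1 false positive, 1 false negative
```

Here precision and recall are both 3/4, so the F1 score is also 0.75.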
Affiliation(s)
- Ireneus Kagashe
  - School of Management and Economics, Beijing Institute of Technology, Beijing, China
- Zhijun Yan
  - School of Management and Economics, Beijing Institute of Technology, Beijing, China
  - Sustainable Development Research Institute for Economy and Society of Beijing, Beijing, China
- Imran Suheryani
  - School of Life Science, Department of Biomedical Engineering, Beijing Institute of Technology, Beijing, China
|
38
|
|
39
|
Park E, Chang HJ, Nam HS. Use of Machine Learning Classifiers and Sensor Data to Detect Neurological Deficit in Stroke Patients. J Med Internet Res 2017; 19:e120. [PMID: 28420599 PMCID: PMC5413803 DOI: 10.2196/jmir.7092] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 12/01/2016] [Revised: 02/02/2017] [Accepted: 03/05/2017] [Indexed: 12/21/2022]
Abstract
Background The pronator drift test (PDT), a neurological examination, is widely used in clinics to measure motor weakness of stroke patients. Objective The aim of this study was to develop a PDT tool with machine learning classifiers to detect stroke symptoms based on quantification of proximal arm weakness using inertial sensors and signal processing. Methods We extracted features of drift and pronation from accelerometer signals of wearable devices on the inner wrists of 16 stroke patients and 10 healthy controls. Signal processing and a feature selection approach were applied to discriminate PDT features used to classify stroke patients. A series of machine learning techniques, namely support vector machine (SVM), radial basis function network (RBFN), and random forest (RF), were implemented to discriminate stroke patients from controls with leave-one-out cross-validation. Results Signal processing by the PDT tool extracted a total of 12 PDT features from sensors. Feature selection abstracted the major attributes from the 12 PDT features to elucidate the dominant characteristics of proximal weakness of stroke patients using machine learning classification. Our proposed PDT classifiers had an area under the receiver operating characteristic curve (AUC) of .806 (SVM), .769 (RBFN), and .900 (RF) without feature selection, and feature selection improved the AUCs to .913 (SVM), .956 (RBFN), and .975 (RF), representing an average performance enhancement of 15.3%. Conclusions Sensors and machine learning methods can reliably detect stroke signs and quantify proximal arm weakness. Our proposed solution will facilitate pervasive monitoring of stroke patients.
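The abstract does not say how the 15.3% average enhancement was derived. One reading that is consistent with the reported AUCs is the mean of the per-classifier relative gains, which the short check below works through; the variable names are illustrative, and the interpretation itself is an assumption rather than the authors' stated method.

```python
# AUC values as reported in the abstract, before and after feature selection.
auc_without_selection = {"SVM": 0.806, "RBFN": 0.769, "RF": 0.900}
auc_with_selection = {"SVM": 0.913, "RBFN": 0.956, "RF": 0.975}

# Relative gain per classifier: (after - before) / before.
relative_gain = {
    model: (auc_with_selection[model] - auc_without_selection[model])
    / auc_without_selection[model]
    for model in auc_without_selection
}

# Unweighted mean of the three relative gains, expressed as a percentage.
average_gain_pct = 100 * sum(relative_gain.values()) / len(relative_gain)
```

Under this reading the individual gains are roughly 13.3% (SVM), 24.3% (RBFN), and 8.3% (RF), and their mean rounds to the 15.3% quoted in the Results.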
Affiliation(s)
- Eunjeong Park
  - Cardiovascular Research Institute, Yonsei University College of Medicine, Seoul, Republic of Korea
- Hyuk-Jae Chang
  - Department of Cardiology, Yonsei University College of Medicine, Seoul, Republic of Korea
- Hyo Suk Nam
  - Department of Neurology, Yonsei University College of Medicine, Seoul, Republic of Korea
|
40
|
Lienemann BA, Unger JB, Cruz TB, Chu KH. Methods for Coding Tobacco-Related Twitter Data: A Systematic Review. J Med Internet Res 2017; 19:e91. [PMID: 28363883 PMCID: PMC5392207 DOI: 10.2196/jmir.7022] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 11/30/2016] [Revised: 01/26/2017] [Accepted: 02/23/2017] [Indexed: 11/24/2022]
Abstract
Background As Twitter has grown in popularity to 313 million monthly active users, researchers have increasingly been using it as a data source for tobacco-related research. Objective The objective of this systematic review was to assess the methodological approaches of categorically coded tobacco Twitter data and make recommendations for future studies. Methods Data sources included PsycINFO, Web of Science, PubMed, ABI/INFORM, Communication Source, and Tobacco Regulatory Science. Searches were limited to peer-reviewed journals and conference proceedings in English from January 2006 to July 2016. The initial search identified 274 articles using a Twitter keyword and a tobacco keyword. One coder reviewed all abstracts and identified 27 articles that met the following inclusion criteria: (1) original research, (2) focused on tobacco or a tobacco product, (3) analyzed Twitter data, and (4) coded Twitter data categorically. One coder extracted data collection and coding methods. Results E-cigarettes were the most common type of Twitter data analyzed, followed by specific tobacco campaigns. The most prevalent data sources were Gnip and Twitter’s Streaming application programming interface (API). The primary methods of coding were hand-coding and machine learning. The studies predominantly coded for relevance, sentiment, theme, user or account, and location of user. Conclusions Standards for data collection and coding should be developed to be able to more easily compare and replicate tobacco-related Twitter results. Additional recommendations include the following: sample Twitter’s databases multiple times, make a distinction between message attitude and emotional tone for sentiment, code images and URLs, and analyze user profiles. Being relatively novel and widely used among adolescents and black and Hispanic individuals, Twitter could provide a rich source of tobacco surveillance data among vulnerable populations.
Affiliation(s)
- Brianna A Lienemann
  - Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, United States
- Jennifer B Unger
  - Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, United States
- Tess Boley Cruz
  - Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, United States
- Kar-Hai Chu
  - Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, United States
|
41
|
Lazard AJ, Wilcox GB, Tuttle HM, Glowacki EM, Pikowski J. Public reactions to e-cigarette regulations on Twitter: a text mining analysis. Tob Control 2017; 26:e112-e116. [PMID: 28341768 DOI: 10.1136/tobaccocontrol-2016-053295] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Received: 07/02/2016] [Revised: 02/22/2017] [Accepted: 03/06/2017] [Indexed: 11/04/2022]
Abstract
BACKGROUND In May 2016, the Food and Drug Administration (FDA) issued a final rule that deemed e-cigarettes to be within their regulatory authority as a tobacco product. News and opinions about the regulation were shared on social media platforms, such as Twitter, which can play an important role in shaping the public's attitudes. We analysed information shared on Twitter for insights into initial public reactions. METHODS A text mining approach was used to uncover important topics among reactions to the e-cigarette regulations on Twitter. SAS Text Miner V.12.1 software was used for descriptive text mining to uncover the primary topics from tweets collected from May 1 to May 17 2016 using NUVI software to gather the data. RESULTS A total of nine topics were generated. These topics reveal initial reactions to whether the FDA's e-cigarette regulations will benefit or harm public health, how the regulations will impact the emerging e-cigarette market and efforts to share the news. The topics were dominated by negative or mixed reactions. CONCLUSIONS In the days following the FDA's announcement of the new deeming regulations, the public reaction on Twitter was largely negative. Public health advocates should consider using social media outlets to better communicate the policy's intentions, reach and potential impact for public good to create a more balanced conversation.
Affiliation(s)
- Allison J Lazard
  - School of Media and Journalism, University of North Carolina at Chapel Hill, Chapel Hill, USA
- Gary B Wilcox
  - Stan Richards School of Advertising and Public Relations, University of Texas at Austin, Austin, USA
  - Center for Health Communication, University of Texas at Austin, Austin, USA
- Hannah M Tuttle
  - Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, USA
- Elizabeth M Glowacki
  - Center for Health Communication, University of Texas at Austin, Austin, USA
  - Department of Communication Studies, University of Texas at Austin, Austin, USA
- Jessica Pikowski
  - School of Media and Journalism, University of North Carolina at Chapel Hill, Chapel Hill, USA
|
42
|
Beaunoyer E, Arsenault M, Lomanowska AM, Guitton MJ. Understanding online health information: Evaluation, tools, and strategies. Patient Education and Counseling 2017; 100:183-189. [PMID: 27595436 DOI: 10.1016/j.pec.2016.08.028] [Citation(s) in RCA: 99] [Impact Index Per Article: 14.1] [Received: 03/31/2016] [Revised: 07/18/2016] [Accepted: 08/25/2016] [Indexed: 06/06/2023]
Abstract
OBJECTIVE Considering the status of the Internet as a prominent source of health information, assessing online health material has become a central issue in patient education. We describe the strategies available to evaluate the characteristics of online health information, including readability, emotional content, understandability, usability. METHODS Popular tools used in assessment of readability, emotional content and comprehensibility of online health information were reviewed. Tools designed to evaluate both printed and online material were considered. RESULTS Readability tools are widely used in online health material evaluation and are highly covariant. Assessment of emotional content of online health-related communications via sentiment analysis tools is becoming more popular. Understandability and usability tools have been developed specifically for health-related material, but each tool has important limitations and has been tested on a limited number of health issues. CONCLUSION Despite the availability of numerous assessment tools, their overall reliability differs between readability (high) and understandability (low). Approaches combining multiple assessment tools and involving both quantitative and qualitative observations would optimize assessment strategies. PRACTICE IMPLICATIONS Effective assessment of online health information should rely on mixed strategies combining quantitative and qualitative evaluations. Assessment tools should be selected according to their functional properties and compatibility with target material.
Affiliation(s)
- Elisabeth Beaunoyer
  - Faculty of Medicine, Laval University, Quebec City, QC, Canada
  - Institut Universitaire en Santé Mentale de Québec, Quebec City, QC, Canada
- Marianne Arsenault
  - Faculty of Medicine, Laval University, Quebec City, QC, Canada
  - Institut Universitaire en Santé Mentale de Québec, Quebec City, QC, Canada
- Anna M Lomanowska
  - Department of Psychology, University of Toronto Mississauga, Mississauga, ON, Canada
- Matthieu J Guitton
  - Faculty of Medicine, Laval University, Quebec City, QC, Canada
  - Institut Universitaire en Santé Mentale de Québec, Quebec City, QC, Canada
|
43
|
Zhan Y, Liu R, Li Q, Leischow SJ, Zeng DD. Identifying Topics for E-Cigarette User-Generated Contents: A Case Study From Multiple Social Media Platforms. J Med Internet Res 2017; 19:e24. [PMID: 28108428 PMCID: PMC5291865 DOI: 10.2196/jmir.5780] [Citation(s) in RCA: 65] [Impact Index Per Article: 9.3] [Received: 03/20/2016] [Revised: 08/14/2016] [Accepted: 11/23/2016] [Indexed: 11/25/2022]
Abstract
Background The electronic cigarette (e-cigarette) is an emerging product whose market has grown rapidly in recent years. Social media has become an important platform for information seeking and sharing. We aim to mine hidden topics from e-cigarette datasets collected from different social media platforms. Objective This paper aims to gain a systematic understanding of the characteristics of various types of social media, which will provide deep insights into how consumers and policy makers effectively use social media to track e-cigarette-related content and adjust their decisions and policies. Methods We collected data from Reddit (27,638 e-cigarette flavor-related posts from January 1, 2011, to June 30, 2015), JuiceDB (14,433 e-juice reviews from June 26, 2013, to November 12, 2015), and Twitter (13,356 “e-cig ban”-related tweets from January 1, 2010, to June 30, 2015). Latent Dirichlet Allocation, a generative model for topic modeling, was used to analyze the topics from these data. Results We found four types of topics across the platforms: (1) promotions, (2) flavor discussions, (3) experience sharing, and (4) regulation debates. Promotions included sales from vendors to users, as well as trades among users. A total of 10.72% (2,962/27,638) of the posts from Reddit were related to trading. Promotion links were found between social media platforms. Most of the links (87.30%) in JuiceDB were related to Reddit posts. JuiceDB and Reddit identified consistent flavor categories. E-cigarette vaping methods and features such as steeping, throat hit, and vapor production were broadly discussed both on Reddit and on JuiceDB. Reddit provided space for policy discussions, and the majority of the posts (60.7%) held a negative attitude toward regulations, whereas Twitter was used to launch campaigns using certain hashtags. Our findings are based on data across different platforms. The topic distribution between Reddit and JuiceDB was significantly different (P<.001), which indicated that the user discussions focused on different perspectives across the platforms. Conclusions This study examined Reddit, JuiceDB, and Twitter as social media data sources for e-cigarette research. These mined findings could be further used by other researchers and policy makers. By utilizing the automatic topic-modeling method, the proposed unified feedback model could be a useful tool for policy makers to comprehensively consider how to collect valuable feedback from social media.
Affiliation(s)
- Yongcheng Zhan
  - Department of Management Information Systems, Eller College of Management, The University of Arizona, Tucson, AZ, United States
- Ruoran Liu
  - The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Qiudan Li
  - The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
- Daniel Dajun Zeng
  - Department of Management Information Systems, Eller College of Management, The University of Arizona, Tucson, AZ, United States; The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China

44
Lazard AJ, Saffer AJ, Wilcox GB, Chung AD, Mackert MS, Bernhardt JM. E-Cigarette Social Media Messages: A Text Mining Analysis of Marketing and Consumer Conversations on Twitter. JMIR Public Health Surveill 2016; 2:e171. [PMID: 27956376 PMCID: PMC5187450 DOI: 10.2196/publichealth.6551] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.8] [Received: 08/26/2016] [Revised: 10/31/2016] [Accepted: 11/16/2016] [Indexed: 01/10/2023] Open
Abstract
Background As the use of electronic cigarettes (e-cigarettes) rises, social media likely influences public awareness and perception of this emerging tobacco product. Objective This study examined the public conversation on Twitter to determine overarching themes and insights for trending topics from commercial and consumer users. Methods Text mining uncovered key patterns and important topics for e-cigarettes on Twitter. SAS Text Miner 12.1 software (SAS Institute Inc) was used for descriptive text mining to reveal the primary topics from tweets collected from March 24, 2015, to July 3, 2015, using a Python script in conjunction with Twitter’s streaming application programming interface. A total of 18 keywords related to e-cigarettes were used and resulted in a total of 872,544 tweets that were sorted into overarching themes through a text topic node for tweets (126,127) and retweets (114,451) that represented more than 1% of the conversation. Results While some of the final themes were marketing-focused, many topics represented diverse proponent and user conversations that included discussion of policies, personal experiences, and the differentiation of e-cigarettes from traditional tobacco, often by pointing to the lack of evidence for the harm or risks of e-cigarettes or taking the position that e-cigarettes should be promoted as smoking cessation devices. Conclusions These findings reveal that unique, large-scale public conversations are occurring on Twitter alongside e-cigarette advertising and promotion. Proponents and users are turning to social media to share knowledge, experience, and questions about e-cigarette use. Future research should focus on these unique conversations to understand how they influence attitudes towards and use of e-cigarettes.
Affiliation(s)
- Allison J Lazard
  - School of Media and Journalism, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States; Center for Health Communication, Moody College of Communication, The University of Texas at Austin, Austin, TX, United States
- Adam J Saffer
  - School of Media and Journalism, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Gary B Wilcox
  - Center for Health Communication, Stan Richards School of Advertising and Public Relations, The University of Texas at Austin, Austin, TX, United States
- Arnold DongWoo Chung
  - Center for Health Communication, Stan Richards School of Advertising and Public Relations, The University of Texas at Austin, Austin, TX, United States
- Michael S Mackert
  - Center for Health Communication, Stan Richards School of Advertising and Public Relations, The University of Texas at Austin, Austin, TX, United States
- Jay M Bernhardt
  - Center for Health Communication, Moody College of Communication, The University of Texas at Austin, Austin, TX, United States

45
Han S, Kavuluru R. Exploratory Analysis of Marketing and Non-marketing E-cigarette Themes on Twitter. Social Informatics: 8th International Conference, SocInfo 2016, Bellevue, WA, USA, November 11-14, 2016, Proceedings, Part II 2016; 10047:307-322. [PMID: 28782062 PMCID: PMC5540097 DOI: 10.1007/978-3-319-47874-6_22] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 03/16/2023]
Abstract
Electronic cigarettes (e-cigs) have been gaining popularity and have emerged as a controversial tobacco product since their introduction in 2007 in the U.S. The smoke-free aspect of e-cigs renders them less harmful than conventional cigarettes and is one of the main reasons for their use by people who plan to quit smoking. The US Food and Drug Administration (FDA) introduced new regulations in early May 2016 that went into effect on August 8, 2016. Given this important context, in this paper, we report results of a project to identify current themes in e-cig tweets in terms of semantic interpretations of topics generated with topic modeling. Given that marketing/advertising tweets constitute almost half of all e-cig tweets, we first build a classifier that identifies marketing and non-marketing tweets based on a hand-built dataset of 1000 tweets. After applying the classifier to a dataset of over a million tweets (collected during 4/2015 - 6/2016), we conduct a preliminary content analysis and run topic models on the two sets of tweets separately after identifying the appropriate numbers of topics using topic coherence. We interpret the results of the topic modeling process by relating the topics generated to specific e-cig themes. We also report on themes identified from e-cig tweets generated at particular places (such as schools and churches) for geo-tagged tweets found in our dataset using the GeoNames API. To our knowledge, this is the first effort that employs topic modeling to identify e-cig themes in general and in the context of geo-tagged tweets tied to specific places of interest.
Affiliation(s)
- Sifei Han
  - Department of Computer Science, University of Kentucky, Lexington, KY, USA
- Ramakanth Kavuluru
  - Division of Biomedical Informatics, Department of Internal Medicine, University of Kentucky, Lexington, KY, USA
  - Department of Computer Science, University of Kentucky, Lexington, KY, USA

46
Daniulaityte R, Chen L, Lamy FR, Carlson RG, Thirunarayan K, Sheth A. "When 'Bad' is 'Good'": Identifying Personal Communication and Sentiment in Drug-Related Tweets. JMIR Public Health Surveill 2016; 2:e162. [PMID: 27777215 PMCID: PMC5099500 DOI: 10.2196/publichealth.6327] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 07/07/2016] [Revised: 08/27/2016] [Accepted: 09/21/2016] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND To harness the full potential of social media for epidemiological surveillance of drug abuse trends, the field needs a greater level of automation in processing and analyzing social media content. OBJECTIVES The objective of the study is to describe the development of supervised machine-learning techniques for the eDrugTrends platform to automatically classify tweets by type/source of communication (personal, official/media, retail) and sentiment (positive, negative, neutral) expressed in cannabis- and synthetic cannabinoid-related tweets. METHODS Tweets were collected using Twitter streaming Application Programming Interface and filtered through the eDrugTrends platform using keywords related to cannabis, marijuana edibles, marijuana concentrates, and synthetic cannabinoids. After creating coding rules and assessing intercoder reliability, a manually labeled data set (N=4000) was developed by coding several batches of randomly selected subsets of tweets extracted from the pool of 15,623,869 collected by eDrugTrends (May-November 2015). Out of 4000 tweets, 25% (1000/4000) were used to build source classifiers and 75% (3000/4000) were used for sentiment classifiers. Logistic Regression (LR), Naive Bayes (NB), and Support Vector Machines (SVM) were used to train the classifiers. Source classification (n=1000) tested Approach 1 that used short URLs, and Approach 2 where URLs were expanded and included into the bag-of-words analysis. For sentiment classification, Approach 1 used all tweets, regardless of their source/type (n=3000), while Approach 2 applied sentiment classification to personal communication tweets only (2633/3000, 88%). Multiclass and binary classification tasks were examined, and machine-learning sentiment classifier performance was compared with Valence Aware Dictionary for sEntiment Reasoning (VADER), a lexicon and rule-based method. 
The performance of each classifier was assessed using 5-fold cross validation that calculated average F-scores. A one-tailed t test was used to determine whether differences in F-scores were statistically significant. RESULTS In multiclass source classification, the use of expanded URLs did not contribute to significant improvement in classifier performance (0.7972 vs 0.8102 for SVM, P=.19). In binary classification, the identification of all source categories improved significantly when unshortened URLs were used, with personal communication tweets benefiting the most (0.8736 vs 0.8200, P<.001). In multiclass sentiment classification Approach 1, SVM (0.6723) performed similarly to NB (0.6683) and LR (0.6703). In Approach 2, SVM (0.7062) did not differ from NB (0.6980, P=.13) or LR (F=0.6931, P=.05), but it was over 40% more accurate than VADER (F=0.5030, P<.001). In the multiclass task, improvements in sentiment classification (Approach 2 vs Approach 1) did not reach statistical significance (eg, SVM: 0.7062 vs 0.6723, P=.052). In binary sentiment classification (positive vs negative), Approach 2 (focus on personal communication tweets only) improved classification results, compared with Approach 1, for LR (0.8752 vs 0.8516, P=.04) and SVM (0.8800 vs 0.8557, P=.045). CONCLUSIONS The study provides an example of the use of supervised machine learning methods to categorize cannabis- and synthetic cannabinoid-related tweets with fairly high accuracy. Use of these content analysis tools along with geographic identification capabilities developed by the eDrugTrends platform will provide powerful methods for tracking regional changes in user opinions related to cannabis and synthetic cannabinoids use over time and across different regions.
Affiliation(s)
- Raminta Daniulaityte
  - Center for Interventions, Treatment, and Addictions Research, Department of Population and Public Health Sciences, Boonshoft School of Medicine, Wright State University, Kettering, OH, United States

47
Hornik R. Measuring Campaign Message Exposure and Public Communication Environment Exposure: Some Implications of the Distinction in the Context of Social Media. Communication Methods and Measures 2016; 10:167-169. [PMID: 27766123 PMCID: PMC5066806 DOI: 10.1080/19312458.2016.1150976] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 06/06/2023]
Affiliation(s)
- Robert Hornik
  - Annenberg School for Communication, University of Pennsylvania

48
Palomino M, Taylor T, Göker A, Isaacs J, Warber S. The Online Dissemination of Nature-Health Concepts: Lessons from Sentiment Analysis of Social Media Relating to "Nature-Deficit Disorder". International Journal of Environmental Research and Public Health 2016; 13:E142. [PMID: 26797628 PMCID: PMC4730533 DOI: 10.3390/ijerph13010142] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 10/30/2015] [Revised: 12/24/2015] [Accepted: 12/30/2015] [Indexed: 12/18/2022]
Abstract
Evidence continues to grow supporting the idea that restorative environments, green exercise, and nature-based activities positively impact human health. Nature-deficit disorder, a journalistic term proposed to describe the ill effects of people's alienation from nature, is not yet formally recognized as a medical diagnosis. However, over the past decade, the phrase has been enthusiastically taken up by some segments of the lay public. Social media, such as Twitter, with its opportunities to gather "big data" related to public opinions, offers a medium for exploring the discourse and dissemination around nature-deficit disorder and other nature-health concepts. In this paper, we report our experience of collecting more than 175,000 tweets, applying sentiment analysis to measure positive, neutral or negative feelings, and preliminarily mapping the impact on dissemination. Sentiment analysis is currently used to investigate the repercussions of events in social networks, scrutinize opinions about products and services, and understand various aspects of the communication in Web-based communities. Based on a comparison of nature-deficit-disorder "hashtags" and more generic nature hashtags, we make recommendations for the better dissemination of public health messages through changes to the framing of messages. We show the potential of Twitter to aid in better understanding the impact of the natural environment on human health and wellbeing.
Affiliation(s)
- Marco Palomino
  - School of Computing and Digital Media, Robert Gordon University, Aberdeen, Scotland AB10 7GE, UK
- Tim Taylor
  - European Centre for Environment and Human Health, University of Exeter Medical School, Truro, Cornwall TR1 3HD, UK
- Ayse Göker
  - School of Computing and Digital Media, Robert Gordon University, Aberdeen, Scotland AB10 7GE, UK
- John Isaacs
  - School of Computing and Digital Media, Robert Gordon University, Aberdeen, Scotland AB10 7GE, UK
- Sara Warber
  - European Centre for Environment and Human Health, University of Exeter Medical School, Truro, Cornwall TR1 3HD, UK
  - Department of Family Medicine, University of Michigan Medical School, Ann Arbor, MI 48104-1213, USA

49
Chu KH, Unger JB, Allem JP, Pattarroyo M, Soto D, Cruz TB, Yang H, Jiang L, Yang CC. Diffusion of Messages from an Electronic Cigarette Brand to Potential Users through Twitter. PLoS One 2015; 10:e0145387. [PMID: 26684746 PMCID: PMC4694088 DOI: 10.1371/journal.pone.0145387] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 08/05/2015] [Accepted: 12/03/2015] [Indexed: 11/19/2022] Open
Abstract
Objective This study explores the presence and actions of an electronic cigarette (e-cigarette) brand, Blu, on Twitter to observe how marketing messages are sent and diffused through the retweet (i.e., message forwarding) functionality. Retweet networks enable messages to reach additional Twitter users beyond the sender’s local network. We follow messages from their origin through multiple retweets to identify which messages have more reach, and the different users who are exposed. Methods We collected three months of publicly available data from Twitter. A combination of techniques in social network analysis and content analysis was applied to determine the various networks of users who are exposed to e-cigarette messages and how the retweet network can affect which messages spread. Results The Blu retweet network expanded during the study period. Analysis of user profiles combined with network cluster analysis showed that messages on certain topics were circulated only within a community of e-cigarette supporters, while other topics spread further, reaching more general Twitter users who may not support or use e-cigarettes. Conclusions Retweet networks can serve as proxy filters for marketing messages, as Twitter users decide which messages they will continue to diffuse among their followers. As certain e-cigarette messages extend beyond their point of origin, the audience being exposed expands beyond the e-cigarette community. Potential implications for health education campaigns include utilizing Twitter and targeting important gatekeepers or hubs that would maximize message diffusion.
Affiliation(s)
- Kar-Hai Chu
  - Department of Preventive Medicine, University of Southern California, Los Angeles, California, United States of America
- Jennifer B. Unger
  - Department of Preventive Medicine, University of Southern California, Los Angeles, California, United States of America
- Jon-Patrick Allem
  - Department of Preventive Medicine, University of Southern California, Los Angeles, California, United States of America
- Monica Pattarroyo
  - Department of Preventive Medicine, University of Southern California, Los Angeles, California, United States of America
- Daniel Soto
  - Department of Preventive Medicine, University of Southern California, Los Angeles, California, United States of America
- Tess Boley Cruz
  - Department of Preventive Medicine, University of Southern California, Los Angeles, California, United States of America
- Haodong Yang
  - College of Computing and Informatics, Drexel University, Philadelphia, Pennsylvania, United States of America
- Ling Jiang
  - College of Computing and Informatics, Drexel University, Philadelphia, Pennsylvania, United States of America
- Christopher C. Yang
  - College of Computing and Informatics, Drexel University, Philadelphia, Pennsylvania, United States of America