Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Meng HW, Kath S, Li D, Nguyen QC. National substance use patterns on Twitter. PLoS One 2017;12:e0187691. [PMID: 29107961 DOI: 10.1371/journal.pone.0187691] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 05/19/2016] [Accepted: 10/23/2017] [Indexed: 01/14/2023] Open

Cited by Other Article(s)

Castillo-Toledo C, Fraile-Martínez O, Donat-Vargas C, Lara-Abelenda FJ, Ortega MA, Garcia-Montero C, Mora F, Alvarez-Mon M, Quintero J, Alvarez-Mon MA. Insights from the Twittersphere: a cross-sectional study of public perceptions, usage patterns, and geographical differences of tweets discussing cocaine. Front Psychiatry 2024;15:1282026. [PMID: 38566955 PMCID: PMC10986306 DOI: 10.3389/fpsyt.2024.1282026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Accepted: 02/27/2024] [Indexed: 04/04/2024] Open

Abstract

Introduction

Cocaine abuse represents a major public health concern. The social perception of cocaine has been changing over the decades, a phenomenon closely tied to its patterns of use and abuse. Twitter is a valuable tool to understand the status of drug use and abuse globally. However, no specific studies discussing cocaine have been conducted on this platform.

Methods

A total of 111,508 English and Spanish tweets containing "cocaine" from 2018 to 2022 were analyzed. Of these, 550 were manually studied, and the largest subset underwent automated classification. Tweets related to cocaine were then analyzed to examine their content, types of Twitter users, usage patterns, health effects, and personal experiences. Geolocation data were also considered to understand regional differences.

Results

A total of 71,844 classifiable tweets were obtained. Among these, 15.95% of users discussed the harm of cocaine consumption to health. Media outlets had the highest number of tweets (35.11%), and the most frequent theme was social/political denunciation (67.88%). Among tweets describing experiences related to consumption, those with a negative sentiment predominated. In total, 9.03% of tweets explicitly mentioned frequent use of the drug. The continent with the highest number of tweets was America (55.44% of the total).

Discussion

The findings underscore the significance of cocaine as a current social and political issue, with a predominant focus on political and social denunciation in the majority of tweets. Notably, the study reveals a concentration of tweets from the United States and South American countries, reflecting the high prevalence of cocaine-related disorders and overdose cases in these regions. Alarmingly, the study highlights the trivialization of cocaine consumption on Twitter, accompanied by a misleading promotion of its health benefits, emphasizing the urgent need for targeted interventions and antidrug content on social media platforms. Finally, the unexpected advocacy for cocaine by healthcare professionals raises concerns about potential drug abuse within this demographic, warranting further investigation.

Affiliation(s)

Consuelo Castillo-Toledo: Department of Psychiatry and Mental Health, Hospital Universitario Infanta Leonor, Madrid, Spain; Department of Medicine and Medical Specialities, Faculty of Medicine and Health Sciences, University of Alcala, Alcala de Henares, Spain
Oscar Fraile-Martínez: Department of Medicine and Medical Specialities, Faculty of Medicine and Health Sciences, University of Alcala, Alcala de Henares, Spain; Ramón y Cajal Institute of Sanitary Research (IRYCIS), Madrid, Spain
Carolina Donat-Vargas: Cardiovascular and Nutritional Epidemiology, Institute of Environmental Medicine, Karolinska Institute, Stockholm, Sweden; IMDEA-Food Institute, Universidad Autónoma de Madrid, Consejo Superior de Investigaciones Científicas, Madrid, Spain
F. J. Lara-Abelenda: Department of Medicine and Medical Specialities, Faculty of Medicine and Health Sciences, University of Alcala, Alcala de Henares, Spain; Departamento Teoria de la Señal y Comunicaciones y Sistemas Telemáticos y Computación, Escuela Tecnica Superior de Ingenieria de Telecomunicación, Universidad Rey Juan Carlos, Fuenlabrada, Spain
Miguel Angel Ortega: Department of Medicine and Medical Specialities, Faculty of Medicine and Health Sciences, University of Alcala, Alcala de Henares, Spain; Ramón y Cajal Institute of Sanitary Research (IRYCIS), Madrid, Spain
Cielo Garcia-Montero: Department of Medicine and Medical Specialities, Faculty of Medicine and Health Sciences, University of Alcala, Alcala de Henares, Spain; Ramón y Cajal Institute of Sanitary Research (IRYCIS), Madrid, Spain
Fernando Mora: Department of Psychiatry and Mental Health, Hospital Universitario Infanta Leonor, Madrid, Spain; Department of Legal Medicine and Psychiatry, Complutense University, Madrid, Spain
Melchor Alvarez-Mon: Department of Medicine and Medical Specialities, Faculty of Medicine and Health Sciences, University of Alcala, Alcala de Henares, Spain; Ramón y Cajal Institute of Sanitary Research (IRYCIS), Madrid, Spain; Service of Internal Medicine and Immune System Diseases-Rheumatology, University Hospital Príncipe de Asturias (CIBEREHD), Alcalá de Henares, Spain
Javier Quintero: Department of Psychiatry and Mental Health, Hospital Universitario Infanta Leonor, Madrid, Spain; Department of Legal Medicine and Psychiatry, Complutense University, Madrid, Spain
Miguel Angel Alvarez-Mon: Department of Psychiatry and Mental Health, Hospital Universitario Infanta Leonor, Madrid, Spain; Department of Medicine and Medical Specialities, Faculty of Medicine and Health Sciences, University of Alcala, Alcala de Henares, Spain; Ramón y Cajal Institute of Sanitary Research (IRYCIS), Madrid, Spain

Parker MA, Valdez D, Rao VK, Eddens KS, Agley J. Results and Methodological Implications of the Digital Epidemiology of Prescription Drug References Among Twitter Users: Latent Dirichlet Allocation (LDA) Analyses. J Med Internet Res 2023;25:e48405. [PMID: 37505795 PMCID: PMC10422173 DOI: 10.2196/48405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Revised: 06/01/2023] [Accepted: 06/15/2023] [Indexed: 07/29/2023] Open

Abstract

BACKGROUND

Social media is an important information source for a growing subset of the population and can likely be leveraged to provide insight into the evolving drug overdose epidemic. Twitter can provide valuable insight into trends, colloquial information available to potential users, and how networks and interactivity might influence what people are exposed to and how they engage in communication around drug use.

OBJECTIVE

This exploratory study was designed to investigate the ways in which unsupervised machine learning analyses using natural language processing could identify coherent themes for tweets containing substance names.

METHODS

This study involved harnessing data from Twitter, including large-scale collection of brand name (N=262,607) and street name (N=204,068) prescription drug-related tweets and use of unsupervised machine learning analyses (ie, natural language processing) of collected data with data visualization to identify pertinent tweet themes. Latent Dirichlet allocation (LDA) with coherence score calculations was performed to compare brand (eg, OxyContin) and street (eg, oxys) name tweets.
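To make the idea of a topic coherence score concrete, the following is a minimal sketch of one common measure, UMass coherence, computed for a single topic's top words. The abstract does not say which coherence measure the authors used, so the choice of UMass is an assumption, and the function name `umass_coherence` and the toy documents are hypothetical.

```python
import math

def umass_coherence(top_words, documents):
    """UMass coherence for one topic; a higher score means more coherent."""
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets)

    score = 0.0
    # top_words is assumed to be ordered from most to least probable.
    for m in range(1, len(top_words)):
        for l in range(m):
            denom = doc_freq(top_words[l])
            if denom == 0:
                raise ValueError(f"{top_words[l]!r} does not occur in the corpus")
            co_occurrences = doc_freq(top_words[m], top_words[l])
            score += math.log((co_occurrences + 1) / denom)
    return score

# Hypothetical tokenized tweets, for illustration only.
toy_docs = [
    ["oxycontin", "border", "crime"],
    ["oxycontin", "border", "policy"],
    ["sonata", "music", "piano"],
]
brand_topic = umass_coherence(["oxycontin", "border"], toy_docs)
mixed_topic = umass_coherence(["oxycontin", "piano"], toy_docs)
print(brand_topic > mixed_topic)
```

Words that tend to appear in the same tweets score higher than words that never co-occur, which is one way an ambiguous term could lower a topic's coherence.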

RESULTS

We found people discussed drug use differently depending on whether a brand name or street name was used. Brand name categories often contained political talking points (eg, border, crime, and political handling of ongoing drug mitigation strategies). In contrast, categories containing street names occasionally referenced drug misuse, though multiple social uses for a term (eg, Sonata) muddled topic clarity.

CONCLUSIONS

Content in the brand name corpus reflected discussion about the drug itself and less often reflected personal use. However, content in the street name corpus was notably more diverse and resisted simple LDA categorization. We speculate this may reflect effective use of slang terminology to clandestinely discuss drug-related activity. If so, straightforward analyses of digital drug-related communication may be more difficult than previously assumed. This work has the potential to be used for surveillance and detection of harmful drug use information. It also might be used for appropriate education and dissemination of information to persons engaged in drug use content on Twitter.

Fu R, Kundu A, Mitsakakis N, Elton-Marshall T, Wang W, Hill S, Bondy SJ, Hamilton H, Selby P, Schwartz R, Chaiton MO. Machine learning applications in tobacco research: a scoping review. Tob Control 2023;32:99-109. [PMID: 34452986 DOI: 10.1136/tobaccocontrol-2020-056438] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 12/19/2020] [Accepted: 04/14/2021] [Indexed: 12/23/2022]

Diet during the COVID-19 pandemic: An analysis of Twitter data. Patterns 2022;3:100547. [PMID: 35721836 PMCID: PMC9197791 DOI: 10.1016/j.patter.2022.100547] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2022] [Revised: 05/19/2022] [Accepted: 06/10/2022] [Indexed: 11/29/2022]

Abstract

In this study, we measured the association between county characteristics and changes in healthy-food, fast-food, and alcohol tweets during the COVID-19 pandemic in the United States. Our analytic dataset consisted of 1,282,316 geotagged tweets that referenced food consumption posted before (63.2%) and during (36.8%) the pandemic and included all US states. We found the share of healthy-food tweets increased by 20.5% during the pandemic compared with pre-pandemic, while fast-food and alcohol tweets decreased by 9.4% and 11.4%, respectively. We also observed that time spent at home and more grocery stores per capita were associated with increased odds of healthy-food tweets and decreased odds of fast-food tweets. More liquor stores per capita was associated with increased odds of alcohol tweets. Our results highlight the potential impact of the pandemic on nutrition and alcohol consumption and the association between the built environment and health behaviors.

• We used Twitter data to quantify self-reported diet trends during the COVID-19 pandemic
• Healthy food consumption increased during the pandemic; alcohol consumption decreased
• Proximity to grocery stores and more time at home were associated with healthier diet
• Proximity to liquor stores corresponded with increased alcohol consumption

The COVID-19 pandemic upended many aspects of daily life, including how we eat and drink. Restaurant closures and retail restrictions likely impacted individuals’ consumption habits, but longitudinal surveys that monitor nutrition and/or alcohol intake are costly to administer and are prone to response bias. In this study, we use digital trace data from Twitter to track population-level patterns in nutritional intake. Linking geotagged tweets to data measuring US county characteristics and built environment, this study finds that increased time at home and access to grocery stores during the pandemic may have promoted healthy-food consumption. This study also suggests that access to alcohol retail establishments may have led to more drinking. These findings validate the importance of the built environment to health behaviors while highlighting how social media data may be used to assess the impact of public health crises.

Zhou H, Jia H, Lei G, Zhou T, Wu J, Chang Y, Wang L, Sheng M, Yang X. Quantitative assessment of normal hip cartilage in children under 9 years old by T2 mapping. MAGMA 2022;35:459-466. [PMID: 34652541 DOI: 10.1007/s10334-021-00962-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2021] [Revised: 09/19/2021] [Accepted: 09/21/2021] [Indexed: 06/13/2023]

Golder S, Stevens R, O'Connor K, James R, Gonzalez-Hernandez G. Methods to Establish Race or Ethnicity of Twitter Users: Scoping Review. J Med Internet Res 2022;24:e35788. [PMID: 35486433 PMCID: PMC9107046 DOI: 10.2196/35788] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/17/2021] [Revised: 03/08/2022] [Accepted: 03/23/2022] [Indexed: 11/13/2022] Open

Abstract

Background

A growing amount of health research uses social media data. Those critical of social media research often cite that it may be unrepresentative of the population; however, the suitability of social media data in digital epidemiology is more nuanced. Identifying the demographics of social media users can help establish representativeness.

Objective

This study aims to identify the different approaches or combination of approaches to extract race or ethnicity from social media and report on the challenges of using these methods.

Methods

We present a scoping review to identify methods used to extract the race or ethnicity of Twitter users from Twitter data sets. We searched 17 electronic databases from the date of inception to May 15, 2021, and carried out reference checking and hand searching to identify relevant studies. Sifting of each record was performed independently by at least two researchers, with any disagreement discussed. Studies were required to extract the race or ethnicity of Twitter users using either manual or computational methods or a combination of both.

Results

Of the 1249 records sifted, we identified 67 (5.36%) that met our inclusion criteria. Most studies (51/67, 76%) have focused on US-based users and English language tweets (52/67, 78%). A range of data was used, including Twitter profile metadata, such as names, pictures, information from bios (including self-declarations), or location or content of the tweets. A range of methodologies was used, including manual inference, linkage to census data, commercial software, language or dialect recognition, or machine learning or natural language processing. However, not all studies have evaluated these methods. Those that evaluated these methods found accuracy to vary from 45% to 93% with significantly lower accuracy in identifying categories of people of color. The inference of race or ethnicity raises important ethical questions, which can be exacerbated by the data and methods used. The comparative accuracies of the different methods are also largely unknown.

Conclusions

There is no standard accepted approach or current guidelines for extracting or inferring the race or ethnicity of Twitter users. Social media researchers must carefully interpret race or ethnicity and not overpromise what can be achieved, as even manual screening is a subjective, imperfect method. Future research should establish the accuracy of methods to inform evidence-based best practice guidelines for social media researchers and be guided by concerns of equity and social justice.

Jiang L, Huang Y, Cheng H, Zhang T, Huang L. Emergency Response and Risk Communication Effects of Local Media during COVID-19 Pandemic in China: A Study Based on a Social Media Network. Int J Environ Res Public Health 2021;18:10942. [PMID: 34682685 PMCID: PMC8535417 DOI: 10.3390/ijerph182010942] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/13/2021] [Revised: 10/13/2021] [Accepted: 10/13/2021] [Indexed: 01/23/2023]

Kasson E, Singh AK, Huang M, Wu D, Cavazos-Rehg P. Using a mixed methods approach to identify public perception of vaping risks and overall health outcomes on Twitter during the 2019 EVALI outbreak. Int J Med Inform 2021;155:104574. [PMID: 34592539 DOI: 10.1016/j.ijmedinf.2021.104574] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/19/2021] [Revised: 08/30/2021] [Accepted: 09/10/2021] [Indexed: 12/24/2022]

Kino S, Hsu YT, Shiba K, Chien YS, Mita C, Kawachi I, Daoud A. A scoping review on the use of machine learning in research on social determinants of health: Trends and research prospects. SSM Popul Health 2021;15:100836. [PMID: 34169138 PMCID: PMC8207228 DOI: 10.1016/j.ssmph.2021.100836] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 03/02/2021] [Revised: 05/15/2021] [Accepted: 06/01/2021] [Indexed: 02/08/2023] Open

Singh T, Roberts K, Cohen T, Cobb N, Wang J, Fujimoto K, Myneni S. Social Media as a Research Tool (SMaaRT) for Risky Behavior Analytics: Methodological Review. JMIR Public Health Surveill 2020;6:e21660. [PMID: 33252345 PMCID: PMC7735906 DOI: 10.2196/21660] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 06/20/2020] [Revised: 10/05/2020] [Accepted: 11/06/2020] [Indexed: 12/11/2022] Open

Abstract

BACKGROUND

Modifiable risky health behaviors, such as tobacco use, excessive alcohol use, being overweight, lack of physical activity, and unhealthy eating habits, are some of the major factors for developing chronic health conditions. Social media platforms have become indispensable means of communication in the digital era. They provide an opportunity for individuals to express themselves, as well as share their health-related concerns with peers and health care providers, with respect to risky behaviors. Such peer interactions can be utilized as valuable data sources to better understand inter- and intrapersonal psychosocial mediators and the mechanisms of social influence that drive behavior change.

OBJECTIVE

The objective of this review is to summarize computational and quantitative techniques facilitating the analysis of data generated through peer interactions pertaining to risky health behaviors on social media platforms.

METHODS

We performed a systematic review of the literature in September 2020 by searching three databases (PubMed, Web of Science, and Scopus) using relevant keywords, such as "social media," "online health communities," "machine learning," "data mining," etc. The reporting of the studies was directed by the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. Two reviewers independently assessed the eligibility of studies based on the inclusion and exclusion criteria. We extracted the required information from the selected studies.

RESULTS

The initial search returned a total of 1554 studies, and after careful analysis of titles, abstracts, and full texts, a total of 64 studies were included in this review. We extracted the following key characteristics from all of the studies: social media platform used for conducting the study, risky health behavior studied, the number of posts analyzed, study focus, key methodological functions and tools used for data analysis, evaluation metrics used, and summary of the key findings. The most commonly used social media platform was Twitter, followed by Facebook, QuitNet, and Reddit. The most commonly studied risky health behavior was nicotine use, followed by drug or substance abuse and alcohol use. Various supervised and unsupervised machine learning approaches were used for analyzing textual data generated from online peer interactions. Few studies utilized deep learning methods for analyzing textual data as well as image or video data. Social network analysis was also performed, as reported in some studies.

CONCLUSIONS

Our review consolidates the methodological underpinnings for analyzing risky health behaviors and has enhanced our understanding of how social media can be leveraged for nuanced behavioral modeling and representation. The knowledge gained from our review can serve as a foundational component for the development of persuasive health communication and effective behavior modification technologies aimed at the individual and population levels.

Hockenhull J, Black JC, Bletz A, Margolin Z, Olson R, Wood DM, Dart RC, Dargan PI. An evaluation of online discussion relating to nonmedical use of prescription opioids within the UK. Br J Clin Pharmacol 2020;87:1637-1646. [PMID: 33464643 DOI: 10.1111/bcp.14603] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/18/2020] [Revised: 08/21/2020] [Accepted: 09/30/2020] [Indexed: 01/06/2023] Open

van Draanen J, Tao H, Gupta S, Liu S. Geographic Differences in Cannabis Conversations on Twitter: Infodemiology Study. JMIR Public Health Surveill 2020;6:e18540. [PMID: 33016888 PMCID: PMC7573699 DOI: 10.2196/18540] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 03/19/2020] [Revised: 08/28/2020] [Accepted: 08/31/2020] [Indexed: 11/13/2022] Open

Abstract

BACKGROUND

Infodemiology is an emerging field of research that utilizes user-generated health-related content, such as that found in social media, to help improve public health. Twitter has become an important venue for studying emerging patterns in health issues such as substance use because it can reflect trends in real-time and display messages generated directly by users, giving a uniquely personal voice to analyses. Over the past year, several states in the United States have passed legislation to legalize adult recreational use of cannabis and the federal government in Canada has done the same. There are few studies that examine the sentiment and content of tweets about cannabis since the recent legislative changes regarding cannabis have occurred in North America.

OBJECTIVE

To examine differences in the sentiment and content of cannabis-related tweets by state cannabis laws, and to examine differences in sentiment between the United States and Canada between 2017 and 2019.

METHODS

In total, 1,200,127 cannabis-related tweets were collected from January 1, 2017, to June 17, 2019, using the Twitter application programming interface. Tweets then were grouped geographically based on cannabis legal status (legal for adult recreational use, legal for medical use, and no legal use) in the locations from which the tweets came. Sentiment scoring for the tweets was done with VADER (Valence Aware Dictionary and sEntiment Reasoner), and differences in sentiment for states with different cannabis laws were tested using Tukey adjusted two-sided pairwise comparisons. Topic analysis to determine the content of tweets was done using latent Dirichlet allocation in Python, using a Java implementation, LdaMallet, with Gensim wrapper.

RESULTS

Significant differences were seen in tweet sentiment between US states with different cannabis laws (P=.001 for negative sentiment tweets in fully illegal compared to legal for adult recreational use states), as well as between the United States and Canada (P=.003 for positive sentiment and P=.001 for negative sentiment). In both cases, restrictive state policy environments (eg, those where cannabis use is fully illegal, or legal for medical use only) were associated with more negative tweet sentiment than less restrictive policy environments (eg, where cannabis is legal for adult recreational use). Six key topics were found in recent US tweet contents: fun and recreation (keywords, eg, love, life, high); daily life (today, start, live); transactions (buy, sell, money); places of use (room, car, house); medical use and cannabis industry (business, industry, company); and legalization (legalize, police, tax). The keywords representing content of tweets also differed between the United States and Canada.

CONCLUSIONS

Knowledge about how cannabis is being discussed online, and about the geographic differences that exist in these conversations, may help to inform public health planning and prevention efforts. Public health education about how to use cannabis in ways that promote safety and minimize harms may be especially important in places where cannabis is legal for adult recreational and medical use.

Cai M, Shah N, Li J, Chen WH, Cuomo RE, Obradovich N, Mackey TK. Identification and characterization of tweets related to the 2015 Indiana HIV outbreak: A retrospective infoveillance study. PLoS One 2020;15:e0235150. [PMID: 32845882 PMCID: PMC7449407 DOI: 10.1371/journal.pone.0235150] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 01/07/2020] [Accepted: 04/20/2020] [Indexed: 11/29/2022] Open

Abstract

INTRODUCTION

From late 2014 through 2015, Scott County, Indiana faced an HIV outbreak triggered by opioid abuse and transition to injection drug use. Investigating the origins, risk factors, and responses related to this outbreak is critical to inform future surveillance, interventions, and policymaking. In response, this retrospective infoveillance study identifies and characterizes user-generated messages related to opioid abuse, heroin injection drug use, and HIV status using natural language processing (NLP) among Twitter users in Indiana during the period of this HIV outbreak.

MATERIALS AND METHODS

Our study consisted of two phases: data collection and processing, and data analysis. We collected Indiana geolocated tweets from the public Twitter API using Amazon Web Services EC2 instances filtered for geocoded messages in the immediate pre and post period of the outbreak. In the data analysis phase we applied an unsupervised machine learning approach using NLP called the Biterm Topic Model (BTM) to identify tweets related to opioid, heroin/injection, and HIV behavior and then examined these messages for HIV risk-related topics that could be associated with the outbreak.
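As a rough illustration of what a biterm is, the sketch below extracts unordered word pairs from short tokenized texts and counts them across a corpus. This covers only the pair-extraction step that gives the Biterm Topic Model its name, not the topic inference itself; the function names and the toy tweets are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import combinations

def biterms(tokens):
    """Return the unordered word pairs that co-occur in one short text."""
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]

def corpus_biterm_counts(tokenized_tweets):
    """Count biterms over the whole corpus rather than per document."""
    counts = Counter()
    for tokens in tokenized_tweets:
        counts.update(biterms(tokens))
    return counts

# Hypothetical tokenized tweets, for illustration only.
toy_tweets = [
    ["heroin", "needle", "share"],
    ["needle", "share", "hiv"],
]
counts = corpus_biterm_counts(toy_tweets)
print(counts[("needle", "share")])
```

Pooling pairs across the corpus is what lets this family of models cope with very short texts, where per-document word counts are sparse.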

RESULTS

More than 10 million geocoded tweets occurring in Indiana during the immediate pre and post period of the outbreak were collected for analysis. Using BTM, we identified 1350 tweets thought to be relevant to the outbreak and then confirmed 358 tweets using human annotation. The most prevalent themes identified were tweets related to self-reported abuse of illicit and prescription drugs, opioid use disorder, self-reported HIV status, and public sentiment regarding the outbreak. Geospatial analysis found that these messages clustered in population-dense areas outside of the outbreak, including Indianapolis and neighboring Clark County.

DISCUSSION

This infoveillance study characterized the social media conversations of communities in Indiana in the pre and post period of the 2015 HIV outbreak. Behavioral themes detected reflect discussion about risk factors related to HIV transmission stemming from opioid and heroin abuse for priority populations, and also help identify community attitudes that could have motivated or detracted from the use of HIV prevention methods, along with helping identify factors that can impede access to prevention services.

CONCLUSIONS

Infoveillance approaches, such as the analysis conducted in this study, represent a possible strategy to detect a "signal" of the emergence of risk factors associated with an outbreak, though they may be limited in their scope and generalizability. Our results, in conjunction with other forms of public health surveillance, can leverage the growing ubiquity of social media platforms to better detect opioid-related HIV risk knowledge, attitudes and behavior, as well as inform future prevention efforts.

Affiliation(s)

Mingxiang Cai: Global Health Policy Institute, San Diego, CA, United States of America; Department of Healthcare Research and Policy, University of California, San Diego, CA, United States of America; Department of Computer Science and Engineering, University of California, San Diego, CA, United States of America
Neal Shah: Global Health Policy Institute, San Diego, CA, United States of America; Department of Healthcare Research and Policy, University of California, San Diego, CA, United States of America
Jiawei Li: Global Health Policy Institute, San Diego, CA, United States of America; Department of Healthcare Research and Policy, University of California, San Diego, CA, United States of America; Department of Computational Science, Mathematics and Engineering, University of California, San Diego, CA, United States of America
Wen-Hao Chen: Department of Healthcare Research and Policy, University of California, San Diego, CA, United States of America; Department of Computer Science and Engineering, University of California, San Diego, CA, United States of America
Raphael E. Cuomo: Global Health Policy Institute, San Diego, CA, United States of America; Department of Anesthesiology, San Diego School of Medicine, University of California, San Diego, CA, United States of America
Nick Obradovich: Max Planck Institute for Human Development, Berlin, Germany
Tim K. Mackey: Global Health Policy Institute, San Diego, CA, United States of America; Department of Healthcare Research and Policy, University of California, San Diego, CA, United States of America; Department of Anesthesiology, San Diego School of Medicine, University of California, San Diego, CA, United States of America; Division of Infectious Diseases and Global Public Health, Department of Medicine, San Diego School of Medicine, University of California, San Diego, CA, United States of America

Atique S, Bautista JR, Block LJ, Lee JJ, Lozada-Perezmitre E, Nibber R, O'Connor S, Peltonen LM, Ronquillo C, Tayaben J, Thilo FJS, Topaz M. A nursing informatics response to COVID-19: Perspectives from five regions of the world. J Adv Nurs 2020;76:2462-2468. [PMID: 32420652 PMCID: PMC7276900 DOI: 10.1111/jan.14417] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 04/29/2020] [Accepted: 05/12/2020] [Indexed: 01/28/2023]

Cao Y, Stewart K, Factor J, Billing A, Massey E, Artigiani E, Wagner M, Dezman Z, Wish E. Using socially-sensed data to infer ZIP level characteristics for the spatiotemporal analysis of drug-related health problems in Maryland. Health Place 2020;63:102345. [PMID: 32543431 DOI: 10.1016/j.healthplace.2020.102345] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 11/07/2019] [Revised: 04/02/2020] [Accepted: 04/14/2020] [Indexed: 01/07/2023]

Kim MG, Kim J, Kim SC, Jeong J. Twitter Analysis of the Nonmedical Use and Side Effects of Methylphenidate: Machine Learning Study. J Med Internet Res 2020;22:e16466. [PMID: 32130160 PMCID: PMC7063527 DOI: 10.2196/16466] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 10/02/2019] [Revised: 01/08/2020] [Accepted: 01/27/2020] [Indexed: 01/20/2023]

Abstract

Background

Methylphenidate, a stimulant used to treat attention deficit hyperactivity disorder, has the potential to be used nonmedically, such as for studying and recreation. In an era when many people actively use social networking services, experience with the nonmedical use or side effects of methylphenidate might be shared on Twitter.

Objective

The purpose of this study was to analyze tweets about the nonmedical use and side effects of methylphenidate using a machine learning approach.

Methods

A total of 34,293 tweets mentioning methylphenidate from August 2018 to July 2019 were collected using searches for “methylphenidate” and its brand names. Tweets in a randomly selected training dataset (6860/34,293, 20.00%) were annotated as positive or negative for two dependent variables: nonmedical use and side effects. Features such as personal noun, nonmedical use terms, medical use terms, side effect terms, sentiment scores, and the presence of a URL were generated for supervised learning. Using the labeled training dataset and features, support vector machine (SVM) classifiers were built and the performance was evaluated using F₁ scores. The classifiers were applied to the test dataset to determine the number of tweets about nonmedical use and side effects.

Results

Of the 6860 tweets in the training dataset, 5.19% (356/6860) and 5.52% (379/6860) were about nonmedical use and side effects, respectively. The performance of the SVM classifiers for nonmedical use and side effects, expressed as F₁ scores, was 0.547 (precision: 0.926, recall: 0.388, and accuracy: 0.967) and 0.733 (precision: 0.920, recall: 0.609, and accuracy: 0.976), respectively. In the test dataset, the SVM classifiers identified 361 tweets (1.32%) about nonmedical use and 519 tweets (1.89%) about side effects. The proportion of tweets about nonmedical use was highest in May 2019 (46/2624, 1.75%) and December 2018 (36/2041, 1.76%).
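As a quick check on how the reported F₁ scores relate to precision and recall, the following minimal Python sketch recomputes them from the figures above. The function name `f1_score` is an illustrative choice, not something taken from the study. It shows why high precision alone does not yield a high F₁: the low recall for nonmedical use pulls the harmonic mean down.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall values as reported in the abstract.
nonmedical_f1 = f1_score(0.926, 0.388)
side_effect_f1 = f1_score(0.920, 0.609)

print(round(nonmedical_f1, 3))   # 0.547
print(round(side_effect_f1, 3))  # 0.733
```

Both values agree with the reported F₁ scores to three decimal places.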

Conclusions

The SVM classifiers that were built in this study were highly precise and accurate and will help to automatically identify the nonmedical use and side effects of methylphenidate using Twitter.

Hu H, Phan N, Chun SA, Geller J, Vo H, Ye X, Jin R, Ding K, Kenne D, Dou D. An insight analysis and detection of drug-abuse risk behavior on Twitter with self-taught deep learning. Computational Social Networks 2019. [DOI: 10.1186/s40649-019-0071-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/10/2022]

Klein AZ, Sarker A, Cai H, Weissenbacher D, Gonzalez-Hernandez G. Social media mining for birth defects research: A rule-based, bootstrapping approach to collecting data for rare health-related events on Twitter. J Biomed Inform 2018;87:68-78. [PMID: 30292855 PMCID: PMC6295660 DOI: 10.1016/j.jbi.2018.10.001] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 04/12/2018] [Revised: 09/26/2018] [Accepted: 10/03/2018] [Indexed: 10/28/2022]

Abstract

Background

Although birth defects are the leading cause of infant mortality in the United States, methods for observing human pregnancies with birth defect outcomes are limited.

Objective

The primary objectives of this study were (i) to assess whether rare health-related events (in this case, birth defects) are reported on social media, (ii) to design and deploy a natural language processing (NLP) approach for collecting such sparse data from social media, and (iii) to utilize the collected data to discover a cohort of women whose pregnancies with birth defect outcomes could be observed on social media for epidemiological analysis.

Methods

To assess whether birth defects are mentioned on social media, we mined 432 million tweets posted by 112,647 users who were automatically identified via their public announcements of pregnancy on Twitter. To retrieve tweets that mention birth defects, we developed a rule-based, bootstrapping approach, which relies on a lexicon, lexical variants generated from the lexicon entries, regular expressions, post-processing, and manual analysis guided by distributional properties. To identify users whose pregnancies with birth defect outcomes could be observed for epidemiological analysis, inclusion criteria were (i) tweets indicating that the user's child has a birth defect, and (ii) accessibility to the user's tweets during pregnancy. We conducted a semi-automatic evaluation to estimate the recall of the tweet-collection approach, and performed a preliminary assessment of the prevalence of selected birth defects among the pregnancy cohort derived from Twitter.
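The retrieval step described above (a lexicon, lexical variants generated from its entries, and regular expressions) can be sketched roughly as follows. This is only an illustration under stated assumptions: the two lexicon entries, the variant rules, and the function names are all hypothetical and are not taken from the study, and the post-processing and manual analysis stages are not modelled.

```python
import re
from typing import List, Set

# Hypothetical lexicon entries, chosen only for illustration.
LEXICON: List[str] = ["cleft palate", "spina bifida"]


def lexical_variants(term: str) -> Set[str]:
    """Return the term plus a hyphenated and a spaceless variant (assumed rules)."""
    return {term, term.replace(" ", "-"), term.replace(" ", "")}


def build_pattern(lexicon: List[str]) -> "re.Pattern[str]":
    """Compile one case-insensitive regex matching any variant as a whole word."""
    variants = {v for term in lexicon for v in lexical_variants(term)}
    # Longest alternatives first, so longer variants are preferred when they overlap.
    ordered = sorted(variants, key=len, reverse=True)
    alternation = "|".join(re.escape(v) for v in ordered)
    return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)


PATTERN = build_pattern(LEXICON)


def mentions_birth_defect(tweet: str) -> bool:
    """True if the tweet text contains any lexicon entry or generated variant."""
    return PATTERN.search(tweet) is not None
```

A match here only means that a term is mentioned. Deciding whether the tweet reports that the user's own child has a birth defect still needs the manual annotation described in the results.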

Results

We manually annotated 16,822 retrieved tweets, distinguishing tweets indicating that the user's child has a birth defect (true positives) from tweets that merely mention birth defects (false positives). Inter-annotator agreement was substantial: κ = 0.79 (Cohen's kappa). Analyzing the timelines of the 646 users whose tweets were true positives resulted in the discovery of 195 users that met the inclusion criteria. Congenital heart defects are the most common type of birth defect reported on Twitter, consistent with findings in the general population. Based on an evaluation of 4169 tweets retrieved using alternative text mining methods, the recall of the tweet-collection approach was 0.95.
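For readers unfamiliar with the agreement statistic reported above, a minimal sketch of Cohen's kappa for two annotators follows. The labels are invented toy data, not the study's annotations, and the function name `cohens_kappa` is an illustrative choice. Kappa compares observed agreement with the agreement expected by chance from each annotator's label frequencies.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(labels_a: Sequence[Hashable], labels_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("Both annotators must label the same non-empty set of items.")
    n = len(labels_a)
    # Fraction of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # Both annotators used one identical label throughout.
    return (observed - expected) / (1 - expected)


# Toy example (invented): 1 = true positive, 0 = false positive.
annotator_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
annotator_2 = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
kappa = cohens_kappa(annotator_1, annotator_2)
```

In this toy example the annotators agree on 8 of 10 items and chance agreement is 0.5, so kappa is approximately 0.6.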

Conclusions

Our contributions include (i) evidence that rare health-related events are indeed reported on Twitter, (ii) a generalizable, systematic NLP approach for collecting sparse tweets, (iii) a semi-automatic method to identify undetected tweets (false negatives), and (iv) a collection of publicly available tweets by pregnant users with birth defect outcomes, which could be used for future epidemiological analysis. In future work, the annotated tweets could be used to train machine learning algorithms to automatically identify users reporting birth defect outcomes, enabling the large-scale use of social media mining as a complementary method for such epidemiological research.
