1
Webster JL, Lakamana S, Ge Y, Sarker A. "I Been Taking Adderall Mixing it With Lean, Hope I Don't Wake Up Out My Sleep": Harnessing Twitter to Understand Nonmedical Prescription Stimulant Use among Black Women and Men Subscribers. medRxiv: The Preprint Server for Health Sciences 2024:2024.12.03.24318408. [PMID: 39677440 PMCID: PMC11643189 DOI: 10.1101/2024.12.03.24318408] [Indexed: 12/17/2024]
Abstract
Black women and men have higher stimulant-involved overdose mortality than other racial groups despite lower lifetime use. Mortality from prescription stimulant medications is rising in tandem with prescribing of these medications. We used Twitter to explore nonmedical prescription stimulant use (NMPSU) among Black women and men using emotion and sentiment analysis and topic modeling. We applied the NRC Lexicon, the VADER dictionary, and LDA topic modeling to examine feelings and themes in conversations about NMPSU by gender. We paid attention to the ability of natural language processing techniques to detect differences in emotion and sentiment among Black Twitter subscribers given increased mortality from stimulants. We found that, although emotion and sentiment outcomes match the directionality of emotions and sentiment observed (i.e., Black Twitter subscribers use more positive language in tweets), this match masks limitations of the NRC and VADER dictionaries in distinguishing feelings expressed by Black people. Even so, LDA topic models showcased the relevance of hip-hop, dependence on NMPSU, and recreational use as consequential to Black Twitter subscribers' discussions. However, gender shaped the relevance of these topics for each group. Greater attention needs to be paid to how Black women and men use social media to discuss important topics like drug use. Natural language processing methods and social media research should include larger proportions of Black, Hispanic/Latinx, and American Indian populations in the development of emotion and sentiment lexicons; otherwise, outcomes regarding NMPSU will not be generalizable to populations writ large due to cultural differences in communication about drug use online.
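The lexicon-based emotion analysis named in this abstract can be illustrated with a minimal sketch. It assumes a tiny, hypothetical word-to-category dictionary (`TOY_LEXICON`); the entries are invented for illustration and are not taken from the NRC Lexicon, and the function names are likewise assumptions rather than the authors' code.

```python
import re
from collections import Counter

# Hypothetical toy lexicon: word -> emotion/sentiment categories.
# These entries are illustrative only and are NOT real NRC Lexicon data.
TOY_LEXICON = {
    "hope": {"positive", "anticipation"},
    "love": {"positive", "joy"},
    "afraid": {"negative", "fear"},
    "sick": {"negative", "sadness"},
}


def emotion_counts(text, lexicon=TOY_LEXICON):
    """Count how many times each category is triggered by words in text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        # Words missing from the lexicon contribute nothing.
        for category in lexicon.get(token, ()):
            counts[category] += 1
    return counts


def emotion_shares(text, lexicon=TOY_LEXICON):
    """Convert category counts into proportions of all category hits."""
    counts = emotion_counts(text, lexicon)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.items()}
```

In this sketch, any word that is absent from the dictionary (slang or community-specific vocabulary, for example) is silently ignored, which is a general property of dictionary lookups and one way such a method can miss what a writer actually expresses.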
Affiliation(s)
- Sahithi Lakamana
  - Department of Biomedical Informatics, School of Medicine, Emory University
- Yao Ge
  - Department of Biomedical Informatics, School of Medicine, Emory University
- Abeed Sarker
  - Department of Biomedical Informatics, School of Medicine, Emory University
2
Arif M, Shahiki Tash M, Jamshidi A, Ullah F, Ameer I, Kalita J, Gelbukh A, Balouchzahi F. Analyzing hope speech from psycholinguistic and emotional perspectives. Sci Rep 2024; 14:23548. [PMID: 39384851 PMCID: PMC11464665 DOI: 10.1038/s41598-024-74630-y] [Received: 07/01/2024] [Accepted: 09/27/2024] [Indexed: 10/11/2024]
Abstract
Hope is a vital coping mechanism, enabling individuals to effectively confront life's challenges. This study proposes a technique employing Natural Language Processing (NLP) tools like Linguistic Inquiry and Word Count (LIWC), NRC-emotion-lexicon, and vaderSentiment to analyze social media posts, extracting psycholinguistic, emotion, and sentiment features from a hope speech dataset. The findings of this study reveal distinct cognitive, emotional, and communicative characteristics, as well as psycholinguistic dimensions, emotions, and sentiments, associated with different types of hope shared on social media. Furthermore, the study investigates the potential of leveraging this data to classify different types of hope using machine learning algorithms. Notably, models such as LightGBM and CatBoost demonstrate impressive performance, surpassing traditional methods and competing effectively with deep learning techniques. We employed hyperparameter tuning to optimize the models' parameters and compared their performance using both default and tuned settings. The results highlight the enhanced efficiency achieved through hyperparameter tuning for these models.
Affiliation(s)
- Muhammad Arif
  - Centro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN), Mexico City, Mexico
- Moein Shahiki Tash
  - Centro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN), Mexico City, Mexico
- Ainaz Jamshidi
  - Department of Information Systems, University of Maryland Baltimore County (UMBC), Baltimore, USA
- Fida Ullah
  - Centro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN), Mexico City, Mexico
- Iqra Ameer
  - Division of Engineering and Science at Abington, The Pennsylvania State University, University Park, USA
- Alexander Gelbukh
  - Centro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN), Mexico City, Mexico
- Fazlourrahman Balouchzahi
  - Centro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN), Mexico City, Mexico
3
Carpenter KA, Nguyen AT, Smith DA, Samori IA, Humphreys K, Lembke A, Kiang MV, Eichstaedt JC, Altman RB. Which social media platforms facilitate monitoring the opioid crisis? medRxiv: The Preprint Server for Health Sciences 2024:2024.07.06.24310035. [PMID: 39006412 PMCID: PMC11245080 DOI: 10.1101/2024.07.06.24310035] [Indexed: 07/16/2024]
Abstract
Social media can provide real-time insight into trends in substance use, addiction, and recovery. Prior studies have used platforms such as Reddit and X (formerly Twitter), but evolving policies around data access have threatened these platforms' usability in research. We evaluate the potential of a broad set of platforms to detect emerging trends in the opioid epidemic. From these, we created a shortlist of 11 platforms, for which we documented official policies regulating drug-related discussion, data accessibility, geolocatability, and prior use in opioid-related studies. We quantified their volumes of opioid discussion, capturing informal language by including slang generated using a large language model. Beyond the most commonly used Reddit and X, the platforms with high potential for use in opioid-related surveillance are TikTok, YouTube, and Facebook. Leveraging many different social platforms, instead of a single platform, safeguards against sudden changes to data access and may better capture all populations that use opioids than any single platform.
4
Rao VK, Valdez D, Muralidharan R, Agley J, Eddens KS, Dendukuri A, Panth V, Parker MA. Digital Epidemiology of Prescription Drug References on X (Formerly Twitter): Neural Network Topic Modeling and Sentiment Analysis. J Med Internet Res 2024; 26:e57885. [PMID: 39178036 PMCID: PMC11380061 DOI: 10.2196/57885] [Received: 02/28/2024] [Revised: 06/12/2024] [Accepted: 07/01/2024] [Indexed: 08/24/2024]
Abstract
BACKGROUND: Data from the social media platform X (formerly Twitter) can provide insights into the types of language that are used when discussing drug use. In past research using latent Dirichlet allocation (LDA), we found that tweets containing "street names" of prescription drugs were difficult to classify due to the similarity to other colloquialisms and lack of clarity over how the terms were used. Conversely, "brand name" references were more amenable to machine-driven categorization.
OBJECTIVE: This study sought to use next-generation techniques (beyond LDA) from natural language processing to reprocess X data and automatically cluster groups of tweets into topics to differentiate between street- and brand-name data sets. We also aimed to analyze the differences in emotional valence between the 2 data sets to study the relationship between engagement on social media and sentiment.
METHODS: We used the Twitter application programming interface to collect tweets that contained the street and brand name of a prescription drug within the tweet. Using BERTopic in combination with Uniform Manifold Approximation and Projection and k-means, we generated topics for the street-name corpus (n=170,618) and brand-name corpus (n=245,145). Valence Aware Dictionary and Sentiment Reasoner (VADER) scores were used to classify whether tweets within the topics had positive, negative, or neutral sentiments. Two different logistic regression classifiers were used to predict the sentiment label within each corpus. The first model used a tweet's engagement metrics and topic ID to predict the label, while the second model used those features in addition to the top 5000 tweets with the largest term-frequency-inverse document frequency score.
RESULTS: Using BERTopic, we identified 40 topics for the street-name data set and 5 topics for the brand-name data set, which we generalized into 8 and 5 topics of discussion, respectively. Four of the general themes of discussion in the brand-name corpus referenced drug use, while 2 themes of discussion in the street-name corpus referenced drug use. From the VADER scores, we found that both corpora were inclined toward positive sentiment. Adding the vectorized tweet text increased the accuracy of our models by around 40% compared with the models that did not incorporate the tweet text in both corpora.
CONCLUSIONS: BERTopic was able to classify tweets well. As with LDA, the discussion using brand names was more similar between tweets than the discussion using street names. VADER scores could only be logically applied to the brand-name corpus because of the high prevalence of non-drug-related topics in the street-name data. Brand-name tweets either discussed drugs positively or negatively, with few posts having a neutral emotionality. From our machine learning models, engagement alone was not enough to predict the sentiment label; the added context from the tweets was needed to understand the emotionality of a tweet.
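The step of turning VADER scores into positive, negative, or neutral labels can be sketched as follows. The abstract does not state which cutoffs were used; this sketch assumes the commonly cited convention for VADER's compound score (at or above 0.05 is positive, at or below -0.05 is negative, anything between is neutral). The function names are hypothetical, and the compound scores are taken as given inputs rather than computed here.

```python
from collections import Counter


def label_from_compound(compound, threshold=0.05):
    """Map a VADER-style compound score in [-1, 1] to a sentiment label.

    The 0.05 threshold is an assumed convention, not a value reported
    in the abstract.
    """
    if not -1.0 <= compound <= 1.0:
        raise ValueError("compound score must lie in [-1, 1]")
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def label_shares(compounds, threshold=0.05):
    """Return the share of positive, negative, and neutral labels."""
    labels = Counter(label_from_compound(c, threshold) for c in compounds)
    n = sum(labels.values())
    return {
        name: (labels[name] / n if n else 0.0)
        for name in ("positive", "negative", "neutral")
    }
```

Applied to a corpus, `label_shares` gives the kind of summary behind a statement such as "both corpora were inclined toward positive sentiment", although the exact procedure in the study may differ.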
Affiliation(s)
- Varun K Rao
  - Department of Epidemiology & Biostatistics, School of Public Health Bloomington, Indiana University Bloomington, Bloomington, IN, United States
- Danny Valdez
  - Department of Applied Health Science, School of Public Health Bloomington, Indiana University Bloomington, Bloomington, IN, United States
- Rasika Muralidharan
  - Luddy School of Informatics, Computing and Engineering, Indiana University Bloomington, Bloomington, IN, United States
- Jon Agley
  - Department of Applied Health Science, School of Public Health Bloomington, Indiana University Bloomington, Bloomington, IN, United States
- Kate S Eddens
  - Department of Epidemiology & Biostatistics, School of Public Health Bloomington, Indiana University Bloomington, Bloomington, IN, United States
- Aravind Dendukuri
  - Luddy School of Informatics, Computing and Engineering, Indiana University Bloomington, Bloomington, IN, United States
- Vandana Panth
  - Luddy School of Informatics, Computing and Engineering, Indiana University Bloomington, Bloomington, IN, United States
- Maria A Parker
  - Department of Applied Health Science, School of Public Health Bloomington, Indiana University Bloomington, Bloomington, IN, United States
5
Lamy FR, Meemon N. Exploring Twitter chatter to assess the type and availability of cannabis-related products in Thailand. J Ethn Subst Abuse 2024:1-21. [PMID: 38949657 DOI: 10.1080/15332640.2024.2367253] [Indexed: 07/02/2024]
Abstract
Cannabis-related tweets were collected between January and April 2022 to estimate the availability and characteristics of cannabis products advertised on Twitter amid the legalization of recreational cannabis in Thailand. The Twitter API was called using the tweepy Python library to collect cannabis-related tweets in the Thai language. A total of 185,558 unique tweets were collected over the data collection period based on 83 search terms. A random sample of 20,000 tweets was manually coded by four native Thai speakers to assess the volume and characteristics of tweets offering cannabis. Of the 20,000 sampled tweets, 72.6% were coded as relevant to the study. Of the relevant tweets, 54.6% were advertising cannabis products, 29.8% were personal communications, and 15.6% were related to news or media content. Among the tweets that advertised cannabis products, 94.4% offered cannabis flower, 2.4% cannabis edibles, and 1.8% cannabis concentrates. Consumption of potent forms of cannabis such as edibles and concentrates increases the risk of harmful side effects, especially in a population with limited knowledge about these products. Our findings call for additional monitoring efforts and for greater public awareness of the potent cannabis products emerging in Thailand.
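The nested percentages in this abstract can be turned into approximate tweet counts with a short calculation. The sketch below is not from the paper: the function and key names are hypothetical, and because the reported percentages are rounded, the resulting counts are estimates only.

```python
def approx_count(total, percent):
    """Approximate a count from a total and a rounded percentage."""
    return round(total * percent / 100)


def breakdown(sample_size=20_000):
    """Estimate counts implied by the percentages reported in the abstract."""
    relevant = approx_count(sample_size, 72.6)
    advertising = approx_count(relevant, 54.6)
    return {
        "relevant": relevant,
        "advertising": advertising,
        "personal": approx_count(relevant, 29.8),
        "news_media": approx_count(relevant, 15.6),
        # Shares below are of the advertising tweets only.
        "flower": approx_count(advertising, 94.4),
        "edibles": approx_count(advertising, 2.4),
        "concentrates": approx_count(advertising, 1.8),
    }


if __name__ == "__main__":
    for name, count in breakdown().items():
        print(f"{name}: ~{count}")
```

Under these assumptions, roughly 14,520 of the 20,000 coded tweets were relevant, about 7,928 of those advertised cannabis products, and about 7,484 of the advertisements offered cannabis flower.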
Affiliation(s)
- Francois R Lamy
  - Department of Society and Health, Faculty of Social Sciences and Humanities, Mahidol University, Salaya, Thailand
  - Health Solutions Research Unit, Faculty of Social Sciences and Humanities, Mahidol University, Salaya, Thailand
- Natthani Meemon
  - Department of Society and Health, Faculty of Social Sciences and Humanities, Mahidol University, Salaya, Thailand
6
Walker AL, LoParco C, Rossheim ME, Livingston MD. #Delta8: a retailer-driven increase in Delta-8 THC discussions on Twitter from 2020 to 2021. The American Journal of Drug and Alcohol Abuse 2023; 49:491-499. [PMID: 37433117 PMCID: PMC11022156 DOI: 10.1080/00952990.2023.2222433] [Received: 07/18/2022] [Revised: 06/02/2023] [Accepted: 06/04/2023] [Indexed: 07/13/2023]
Abstract
Background: Delta-8 tetrahydrocannabinol (THC) has experienced significant cultivation, use, and online marketing growth in recent years.
Objectives: This study utilized natural language processing on Twitter data to examine trends in public discussions regarding this novel psychoactive substance.
Methods: This study analyzed the frequency of #Delta8 tweets over time, most commonly used words, sentiment classification of words in tweets, and a qualitative analysis of a random sample of tweets containing the hashtag "Delta8" from January 1, 2020 to September 26, 2021.
Results: A total of 41,828 tweets were collected, with 30,826 unique tweets (73.7%) and 11,002 quotes, retweets, or replies (26.3%). Tweet activity increased from 2020 to 2021, with daily original tweets rising from 8.55 to 149. This increase followed a high-engagement retailer promotion in June 2021. Commonly used terms included "cbd," "cannabis," "edibles," and "cbdoil." Sentiment classification revealed a predominance of "positive" (30.93%) and "trust" (14.26%) categorizations, with 8.42% classified as "negative." Qualitative analysis identified 20 codes, encompassing substance type, retailers, links, and other characteristics.
Conclusion: Twitter discussions on Delta-8 THC exhibited a sustained increase in prevalence from 2020 to 2021, with online retailers playing a dominant role. The content also demonstrated significant overlap with cannabidiol and various cannabis products. Given the growing presence of retailer marketing and sales on social media, it is crucial for public health researchers to monitor and promote relevant Delta-8 health recommendations on these platforms to ensure a balanced conversation.
Affiliation(s)
- Andrew L. Walker
  - Department of Behavioral, Social, and Health Education Sciences, Rollins School of Public Health, Emory University, Atlanta, GA, USA
- Cassidy LoParco
  - Department of Health Behavior and Health Systems, School of Public Health, University of North Texas Health Science Center, Fort Worth, TX, USA
- Matthew E. Rossheim
  - Department of Health Behavior and Health Systems, School of Public Health, University of North Texas Health Science Center, Fort Worth, TX, USA
- Melvin D. Livingston
  - Department of Behavioral, Social, and Health Education Sciences, Rollins School of Public Health, Emory University, Atlanta, GA, USA