1. Castillo-Toledo C, Fraile-Martínez O, Donat-Vargas C, Lara-Abelenda FJ, Ortega MA, Garcia-Montero C, Mora F, Alvarez-Mon M, Quintero J, Alvarez-Mon MA. Insights from the Twittersphere: a cross-sectional study of public perceptions, usage patterns, and geographical differences of tweets discussing cocaine. Front Psychiatry 2024; 15:1282026. [PMID: 38566955 PMCID: PMC10986306 DOI: 10.3389/fpsyt.2024.1282026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Accepted: 02/27/2024] [Indexed: 04/04/2024]
Abstract
Introduction Cocaine abuse represents a major public health concern. The social perception of cocaine has been changing over the decades, a phenomenon closely tied to its patterns of use and abuse. Twitter is a valuable tool to understand the status of drug use and abuse globally. However, no specific studies discussing cocaine have been conducted on this platform. Methods 111,508 English and Spanish tweets containing "cocaine" from 2018 to 2022 were analyzed. 550 were manually studied, and the largest subset underwent automated classification. Then, tweets related to cocaine were analyzed to examine their content, types of Twitter users, usage patterns, health effects, and personal experiences. Geolocation data was also considered to understand regional differences. Results A total of 71,844 classifiable tweets were obtained. Among these, 15.95% of users discussed the harm of cocaine consumption to health. Media outlets had the highest number of tweets (35.11%) and the most frequent theme was social/political denunciation (67.88%). Regarding the experience related to consumption, tweets with a negative sentiment predominated. In addition, 9.03% of tweets explicitly mentioned frequent use of the drug. The continent with the highest number of tweets was America (55.44% of the total). Discussion The findings underscore the significance of cocaine as a current social and political issue, with a predominant focus on political and social denunciation in the majority of tweets. Notably, the study reveals a concentration of tweets from the United States and South American countries, reflecting the high prevalence of cocaine-related disorders and overdose cases in these regions. Alarmingly, the study highlights the trivialization of cocaine consumption on Twitter, accompanied by a misleading promotion of its health benefits, emphasizing the urgent need for targeted interventions and antidrug content on social media platforms. Finally, the unexpected advocacy for cocaine by healthcare professionals raises concerns about potential drug abuse within this demographic, warranting further investigation.
Affiliation(s)
- Consuelo Castillo-Toledo
  - Department of Psychiatry and Mental Health, Hospital Universitario Infanta Leonor, Madrid, Spain
  - Department of Medicine and Medical Specialities, Faculty of Medicine and Health Sciences, University of Alcala, Alcala de Henares, Spain
- Oscar Fraile-Martínez
  - Department of Medicine and Medical Specialities, Faculty of Medicine and Health Sciences, University of Alcala, Alcala de Henares, Spain
  - Ramón y Cajal Institute of Sanitary Research (IRYCIS), Madrid, Spain
- Carolina Donat-Vargas
  - Cardiovascular and Nutritional Epidemiology, Institute of Environmental Medicine, Karolinska Institute, Stockholm, Sweden
  - IMDEA-Food Institute, Universidad Autónoma de Madrid, Consejo Superior de Investigaciones Científicas, Madrid, Spain
- F. J. Lara-Abelenda
  - Department of Medicine and Medical Specialities, Faculty of Medicine and Health Sciences, University of Alcala, Alcala de Henares, Spain
  - Departamento Teoria de la Señal y Comunicaciones y Sistemas Telemáticos y Computación, Escuela Tecnica Superior de Ingenieria de Telecomunicación, Universidad Rey Juan Carlos, Fuenlabrada, Spain
- Miguel Angel Ortega
  - Department of Medicine and Medical Specialities, Faculty of Medicine and Health Sciences, University of Alcala, Alcala de Henares, Spain
  - Ramón y Cajal Institute of Sanitary Research (IRYCIS), Madrid, Spain
- Cielo Garcia-Montero
  - Department of Medicine and Medical Specialities, Faculty of Medicine and Health Sciences, University of Alcala, Alcala de Henares, Spain
  - Ramón y Cajal Institute of Sanitary Research (IRYCIS), Madrid, Spain
- Fernando Mora
  - Department of Psychiatry and Mental Health, Hospital Universitario Infanta Leonor, Madrid, Spain
  - Department of Legal Medicine and Psychiatry, Complutense University, Madrid, Spain
- Melchor Alvarez-Mon
  - Department of Medicine and Medical Specialities, Faculty of Medicine and Health Sciences, University of Alcala, Alcala de Henares, Spain
  - Ramón y Cajal Institute of Sanitary Research (IRYCIS), Madrid, Spain
  - Service of Internal Medicine and Immune System Diseases-Rheumatology, University Hospital Príncipe de Asturias, (CIBEREHD), Alcalá de Henares, Spain
- Javier Quintero
  - Department of Psychiatry and Mental Health, Hospital Universitario Infanta Leonor, Madrid, Spain
  - Department of Legal Medicine and Psychiatry, Complutense University, Madrid, Spain
- Miguel Angel Alvarez-Mon
  - Department of Psychiatry and Mental Health, Hospital Universitario Infanta Leonor, Madrid, Spain
  - Department of Medicine and Medical Specialities, Faculty of Medicine and Health Sciences, University of Alcala, Alcala de Henares, Spain
  - Ramón y Cajal Institute of Sanitary Research (IRYCIS), Madrid, Spain
2. Parker MA, Valdez D, Rao VK, Eddens KS, Agley J. Results and Methodological Implications of the Digital Epidemiology of Prescription Drug References Among Twitter Users: Latent Dirichlet Allocation (LDA) Analyses. J Med Internet Res 2023; 25:e48405. [PMID: 37505795 PMCID: PMC10422173 DOI: 10.2196/48405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Revised: 06/01/2023] [Accepted: 06/15/2023] [Indexed: 07/29/2023]
Abstract
BACKGROUND Social media is an important information source for a growing subset of the population and can likely be leveraged to provide insight into the evolving drug overdose epidemic. Twitter can provide valuable insight into trends, colloquial information available to potential users, and how networks and interactivity might influence what people are exposed to and how they engage in communication around drug use. OBJECTIVE This exploratory study was designed to investigate the ways in which unsupervised machine learning analyses using natural language processing could identify coherent themes for tweets containing substance names. METHODS This study involved harnessing data from Twitter, including large-scale collection of brand name (N=262,607) and street name (N=204,068) prescription drug-related tweets and use of unsupervised machine learning analyses (ie, natural language processing) of collected data with data visualization to identify pertinent tweet themes. Latent Dirichlet allocation (LDA) with coherence score calculations was performed to compare brand (eg, OxyContin) and street (eg, oxys) name tweets. RESULTS We found people discussed drug use differently depending on whether a brand name or street name was used. Brand name categories often contained political talking points (eg, border, crime, and political handling of ongoing drug mitigation strategies). In contrast, categories containing street names occasionally referenced drug misuse, though multiple social uses for a term (eg, Sonata) muddled topic clarity. CONCLUSIONS Content in the brand name corpus reflected discussion about the drug itself and less often reflected personal use. However, content in the street name corpus was notably more diverse and resisted simple LDA categorization. We speculate this may reflect effective use of slang terminology to clandestinely discuss drug-related activity. 
If so, straightforward analyses of digital drug-related communication may be more difficult than previously assumed. This work has the potential to be used for surveillance and detection of harmful drug use information. It also might be used for appropriate education and dissemination of information to persons engaged in drug use content on Twitter.
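The abstract mentions coherence score calculations without saying which measure was used. As a rough illustration only, the sketch below computes a UMass-style coherence for one topic's top words, which rewards word pairs that co-occur in the same documents. The function name, the toy tweets, and the choice of the UMass measure are all assumptions made for this example and are not taken from the study; the sketch also does not fit an LDA model, it only scores a given word list.

```python
import math
from itertools import combinations


def umass_coherence(top_words, docs):
    """UMass-style coherence for one topic's ranked top words.

    docs is a list of token lists. Higher scores mean the top words
    co-occur in documents more often. The +1 smoothing avoids log(0).
    """
    doc_sets = [set(d) for d in docs]

    def doc_freq(*words):
        # Number of documents containing every word in `words`.
        return sum(1 for s in doc_sets if all(w in s for w in words))

    score = 0.0
    # For each pair, condition on the higher-ranked (earlier) word.
    for i, j in combinations(range(len(top_words)), 2):
        earlier, later = top_words[i], top_words[j]
        denom = doc_freq(earlier)
        if denom == 0:
            continue  # word never appears in the corpus; skip the pair
        score += math.log((doc_freq(earlier, later) + 1) / denom)
    return score


# Hypothetical, hand-made token lists (not data from the study).
brand_docs = [
    ["oxycontin", "border", "crime"],
    ["oxycontin", "border", "policy"],
    ["oxycontin", "crime", "policy"],
]
street_docs = [
    ["sonata", "piano", "concert"],
    ["sonata", "sleep", "pill"],
    ["oxys", "pill", "party"],
]

brand_score = umass_coherence(["border", "oxycontin"], brand_docs)
street_score = umass_coherence(["sonata", "pill"], street_docs)
```

In this toy setup the ambiguous street term (a word such as Sonata that also has a non-drug meaning) co-occurs less consistently with drug vocabulary, so its word list scores lower, which is one way the "muddled topic clarity" described in the abstract could show up numerically.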
Affiliation(s)
- Maria A Parker
  - Department of Epidemiology and Biostatistics, School of Public Health, Indiana University Bloomington, Bloomington, IN, United States
- Danny Valdez
  - Department of Applied Health Science, School of Public Health, Indiana University Bloomington, Bloomington, IN, United States
- Varun K Rao
  - Department of Epidemiology and Biostatistics, School of Public Health, Indiana University Bloomington, Bloomington, IN, United States
  - Department of Informatics, Luddy School of Informatics, Computing, and Engineering, Indiana University Bloomington, Bloomington, IN, United States
- Katherine S Eddens
  - Department of Epidemiology and Biostatistics, School of Public Health, Indiana University Bloomington, Bloomington, IN, United States
- Jon Agley
  - Department of Applied Health Science, School of Public Health, Indiana University Bloomington, Bloomington, IN, United States
3. Fu R, Kundu A, Mitsakakis N, Elton-Marshall T, Wang W, Hill S, Bondy SJ, Hamilton H, Selby P, Schwartz R, Chaiton MO. Machine learning applications in tobacco research: a scoping review. Tob Control 2023; 32:99-109. [PMID: 34452986 DOI: 10.1136/tobaccocontrol-2020-056438] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 12/19/2020] [Accepted: 04/14/2021] [Indexed: 12/23/2022]
Abstract
OBJECTIVE Identify and review the body of tobacco research literature that self-identified as using machine learning (ML) in the analysis. DATA SOURCES MEDLINE, EMBASE, PubMed, CINAHL Plus, APA PsycINFO and IEEE Xplore databases were searched up to September 2020. Studies were restricted to peer-reviewed, English-language journal articles, dissertations and conference papers comprising an empirical analysis where ML was identified to be the method used to examine human experience of tobacco. Studies of genomics and diagnostic imaging were excluded. STUDY SELECTION Two reviewers independently screened the titles and abstracts. The reference list of articles was also searched. In an iterative process, eligible studies were classified into domains based on their objectives and types of data used in the analysis. DATA EXTRACTION Using data charting forms, two reviewers independently extracted data from all studies. A narrative synthesis method was used to describe findings from each domain such as study design, objective, ML classes/algorithms, knowledge users and the presence of a data sharing statement. Trends of publication were visually depicted. DATA SYNTHESIS 74 studies were grouped into four domains: ML-powered technology to assist smoking cessation (n=22); content analysis of tobacco on social media (n=32); smoker status classification from narrative clinical texts (n=6) and tobacco-related outcome prediction using administrative, survey or clinical trial data (n=14). Implications of these studies and future directions for ML researchers in tobacco control were discussed. CONCLUSIONS ML represents a powerful tool that could advance the research and policy decision-making of tobacco control. Further opportunities should be explored.
Affiliation(s)
- Rui Fu
  - Institute of Health Policy Management and Evaluation, University of Toronto, Toronto, Ontario, Canada
- Anasua Kundu
  - Ontario Tobacco Research Unit, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
- Nicholas Mitsakakis
  - Institute of Health Policy Management and Evaluation, University of Toronto, Toronto, Ontario, Canada
  - Children's Hospital of Eastern Ontario Research Institute, Ottawa, Ontario, Canada
- Tara Elton-Marshall
  - Institute for Mental Health Policy Research, Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Wei Wang
  - Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Sean Hill
  - Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Susan J Bondy
  - Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Hayley Hamilton
  - Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Peter Selby
  - Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Robert Schwartz
  - Ontario Tobacco Research Unit, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
  - Institute for Mental Health Policy Research, Centre for Addiction and Mental Health, Toronto, Ontario, Canada
- Michael Oliver Chaiton
  - Ontario Tobacco Research Unit, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
  - Institute for Mental Health Policy Research, Centre for Addiction and Mental Health, Toronto, Ontario, Canada
4. Diet during the COVID-19 pandemic: An analysis of Twitter data. Patterns 2022; 3:100547. [PMID: 35721836 PMCID: PMC9197791 DOI: 10.1016/j.patter.2022.100547] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2022] [Revised: 05/19/2022] [Accepted: 06/10/2022] [Indexed: 11/29/2022]
Abstract
In this study, we measured the association between county characteristics and changes in healthy-food, fast-food, and alcohol tweets during the COVID-19 pandemic in the United States. Our analytic dataset consisted of 1,282,316 geotagged tweets that referenced food consumption posted before (63.2%) and during (36.8%) the pandemic and included all US states. We found the share of healthy-food tweets increased by 20.5% during the pandemic compared with pre-pandemic, while fast-food and alcohol tweets decreased by 9.4% and 11.4%, respectively. We also observed that time spent at home and more grocery stores per capita were associated with increased odds of healthy-food tweets and decreased odds of fast-food tweets. More liquor stores per capita were associated with increased odds of alcohol tweets. Our results highlight the potential impact of the pandemic on nutrition and alcohol consumption and the association between the built environment and health behaviors.
We used Twitter data to quantify self-reported diet trends during the COVID-19 pandemic. Healthy food consumption increased during the pandemic; alcohol consumption decreased. Proximity to grocery stores and more time at home were associated with healthier diet. Proximity to liquor stores corresponded with increased alcohol consumption.
The COVID-19 pandemic upended many aspects of daily life, including how we eat and drink. Restaurant closures and retail restrictions likely impacted individuals’ consumption habits, but longitudinal surveys that monitor nutrition and/or alcohol intake are costly to administer and are prone to response bias. In this study, we use digital trace data from Twitter to track population-level patterns in nutritional intake. Linking geotagged tweets to data measuring US county characteristics and built environment, this study finds that increased time at home and access to grocery stores during the pandemic may have promoted healthy-food consumption. This study also suggests that access to alcohol retail establishments may have led to more drinking. These findings validate the importance of the built environment to health behaviors while highlighting how social media data may be used to assess the impact of public health crises.
5. Zhou H, Jia H, Lei G, Zhou T, Wu J, Chang Y, Wang L, Sheng M, Yang X. Quantitative assessment of normal hip cartilage in children under 9 years old by T2 mapping. MAGMA (New York, N.Y.) 2022; 35:459-466. [PMID: 34652541 DOI: 10.1007/s10334-021-00962-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2021] [Revised: 09/19/2021] [Accepted: 09/21/2021] [Indexed: 06/13/2023]
Abstract
OBJECTIVE To investigate the variation in T2 at different zones of normal hip cartilage in children and the relationship between T2 value and age. MATERIALS AND METHODS Nineteen children with 30 normal hip joints were evaluated with a coronal T2 mapping sequence on a 3-Tesla MRI system. The femoral cartilage and acetabular cartilage were first segmented using a mask-based interactive method and then equally divided into eight and six radial sections, respectively. Each radial section was further divided into two layers, corresponding to the superficial and deep halves of the cartilage. Cartilage T2 values of these sections and layers were measured and subsequently analyzed. RESULTS There was a negative correlation between the T2 values in the hip cartilage and the age of the children (rs < -0.6, P1 < 0.05). Articular cartilage T2 increased at angles close to the magic angle (54.7°). Femoral cartilage and acetabular cartilage had a relatively shorter T2 in the radial sections near the vertex of the femoral head. The T2 values in superficial layers of both cartilages were significantly higher than those in deep layers (P < 0.05). CONCLUSION The T2 value decreases as the cartilage develops into a more mature state. Cartilage T2 values in the weight-bearing areas are relatively low due to an increase in collagen density and the loss of interstitial water. The restriction of the water molecules by solid components in the deeper layer of cartilage may decrease the T2 values.
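For readers unfamiliar with the magic angle quoted above, the value of 54.7° follows from a standard result that is not spelled out in the abstract: the orientation-dependent factor of the dipolar interaction, with θ taken as the angle to the static magnetic field, vanishes at that angle. The symbol θ_m is introduced here only for illustration.

```latex
3\cos^2\theta_m - 1 = 0
\;\Longrightarrow\;
\theta_m = \arccos\!\left(\frac{1}{\sqrt{3}}\right) \approx 54.7^\circ
```

Structures oriented near this angle experience weaker dipolar relaxation, which is consistent with the longer T2 reported at angles close to it.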
Affiliation(s)
- Hongyan Zhou
  - School of Medical Imaging, Xuzhou Medical University, Xuzhou, 221004, China
  - Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Huihui Jia
  - Department of Radiology, Children's Hospital of Soochow University, Suzhou, 215025, China
- Gege Lei
  - School of Electronic Engineering and Optoelectronic Technology, Nanjing University of Science and Technology, Nanjing, 210094, China
- Tianli Zhou
  - School of Medical Imaging, Xuzhou Medical University, Xuzhou, 221004, China
  - Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Jizhi Wu
  - Department of Radiology, Children's Hospital of Soochow University, Suzhou, 215025, China
- Yan Chang
  - Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Lei Wang
  - School of Ophthalmology and Optometry, Eye Hospital of Wenzhou Medical University, Wenzhou, 325027, China
- Mao Sheng
  - Department of Radiology, Children's Hospital of Soochow University, Suzhou, 215025, China
- Xiaodong Yang
  - School of Medical Imaging, Xuzhou Medical University, Xuzhou, 221004, China
  - Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
6. Golder S, Stevens R, O'Connor K, James R, Gonzalez-Hernandez G. Methods to Establish Race or Ethnicity of Twitter Users: Scoping Review. J Med Internet Res 2022; 24:e35788. [PMID: 35486433 PMCID: PMC9107046 DOI: 10.2196/35788] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/17/2021] [Revised: 03/08/2022] [Accepted: 03/23/2022] [Indexed: 11/13/2022]
Abstract
Background A growing amount of health research uses social media data. Those critical of social media research often cite that it may be unrepresentative of the population; however, the suitability of social media data in digital epidemiology is more nuanced. Identifying the demographics of social media users can help establish representativeness. Objective This study aims to identify the different approaches or combination of approaches to extract race or ethnicity from social media and report on the challenges of using these methods. Methods We present a scoping review to identify methods used to extract the race or ethnicity of Twitter users from Twitter data sets. We searched 17 electronic databases from the date of inception to May 15, 2021, and carried out reference checking and hand searching to identify relevant studies. Sifting of each record was performed independently by at least two researchers, with any disagreement discussed. Studies were required to extract the race or ethnicity of Twitter users using either manual or computational methods or a combination of both. Results Of the 1249 records sifted, we identified 67 (5.36%) that met our inclusion criteria. Most studies (51/67, 76%) have focused on US-based users and English language tweets (52/67, 78%). A range of data was used, including Twitter profile metadata, such as names, pictures, information from bios (including self-declarations), or location or content of the tweets. A range of methodologies was used, including manual inference, linkage to census data, commercial software, language or dialect recognition, or machine learning or natural language processing. However, not all studies have evaluated these methods. Those that evaluated these methods found accuracy to vary from 45% to 93% with significantly lower accuracy in identifying categories of people of color. The inference of race or ethnicity raises important ethical questions, which can be exacerbated by the data and methods used. 
The comparative accuracies of the different methods are also largely unknown. Conclusions There is no standard accepted approach or current guidelines for extracting or inferring the race or ethnicity of Twitter users. Social media researchers must carefully interpret race or ethnicity and not overpromise what can be achieved, as even manual screening is a subjective, imperfect method. Future research should establish the accuracy of methods to inform evidence-based best practice guidelines for social media researchers and be guided by concerns of equity and social justice.
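The proportions reported in the Results can be checked directly from the counts given in the abstract. The snippet below is only an arithmetic sketch; the helper name `share` is made up for this example, and the counts are the ones quoted above (67 of 1249 records included, 51 and 52 of 67 studies).

```python
def share(part, whole, digits=0):
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100 * part / whole, digits)


included = share(67, 1249, 2)  # records meeting the inclusion criteria
us_based = share(51, 67)       # studies focused on US-based users
english = share(52, 67)        # studies focused on English-language tweets

print(included, us_based, english)
```

The results agree with the 5.36%, 76%, and 78% figures stated in the abstract.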
Affiliation(s)
- Su Golder
  - Department of Health Sciences, University of York, York, United Kingdom
- Robin Stevens
  - School of Communication and Journalism, University of Southern California, Los Angeles, CA, United States
- Karen O'Connor
  - Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Richard James
  - School of Nursing Liaison and Clinical Outreach Coordinator, University of Pennsylvania, Philadelphia, PA, United States
- Graciela Gonzalez-Hernandez
  - Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
7. Jiang L, Huang Y, Cheng H, Zhang T, Huang L. Emergency Response and Risk Communication Effects of Local Media during COVID-19 Pandemic in China: A Study Based on a Social Media Network. Int J Environ Res Public Health 2021; 18:10942. [PMID: 34682685 PMCID: PMC8535417 DOI: 10.3390/ijerph182010942] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/13/2021] [Revised: 10/13/2021] [Accepted: 10/13/2021] [Indexed: 01/23/2023]
Abstract
As the country where COVID-19 was first reported and initially broke out, China has controlled the spread of the pandemic well. The pandemic prevention process included emergency response and risk communication, both of which could notably increase public participation; people's anxiety has been alleviated, their confidence in the government has been enhanced, and the implementation of prevention and control measures has been understood. This study selected 157,283 articles published by 447 accounts across 326 cities in February 2020 from WeChat, the largest social media application in China, to systematically compare the spatial distributions of the effectiveness of emergency response and risk communication. The results showed that there were significant regional differences in the effectiveness of emergency response and risk communication during the pandemic period in China. The effectiveness of emergency response and risk communication is related to the exposure risk to COVID-19; the level of economy, culture, and education of the region; the type of accounts and articles; and the ranking of the articles in posts. The timeliness and distribution types of articles should take into account the psychological changes in communication recipients to avoid disseminating homogenized information to the masses and the resulting period of information-receiving fatigue.
Affiliation(s)
- Lei Huang
  - State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing 210023, China; (L.J.); (Y.H.); (H.C.); (T.Z.)
8. Kasson E, Singh AK, Huang M, Wu D, Cavazos-Rehg P. Using a mixed methods approach to identify public perception of vaping risks and overall health outcomes on Twitter during the 2019 EVALI outbreak. Int J Med Inform 2021; 155:104574. [PMID: 34592539 DOI: 10.1016/j.ijmedinf.2021.104574] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/19/2021] [Revised: 08/30/2021] [Accepted: 09/10/2021] [Indexed: 12/24/2022]
Abstract
INTRODUCTION Vaping product use (i.e., e-cigarettes) has been rising since 2000 in the United States. Negative health outcomes associated with vaping products have created public uncertainty and debates on social media platforms. This study explores the feasibility of using social media as a surveillance tool to identify relevant posts and at-risk vaping users. METHODS Using an interdisciplinary method that leverages natural language processing and manual content analysis, we extracted and analyzed 794,620 vaping-related tweets on Twitter. After observing significant increases in vaping-related tweets in July, August, and September 2019, additional human coding was completed on a subset of these tweets to better understand primary themes of vaping-related discussions on Twitter during this time frame. RESULTS We found significant increases in tweets related to negative health outcomes such as acute lung injury and respiratory issues during the outbreak of e-cigarette/vaping associated lung injury (EVALI) in the fall of 2019. Positive sentiment toward vaping remained high, even across the peak of this outbreak in July, August, and September. Tweets mentioning the public perceptions of youth risk were concerning, as were increases in marketing and marijuana-related tweets during this time. DISCUSSION The preliminary results of this study suggest the feasibility of using Twitter as a means of surveillance for public health crises, and themes found in this research could aid in specifying those groups or populations at risk on Twitter. As such, we plan to build automatic detection algorithms to identify these unique vaping users to connect them with a digital intervention in the future.
Affiliation(s)
- Erin Kasson
  - Department of Psychiatry, Washington University School of Medicine, 660 S Euclid Ave, St. Louis, MO, USA
- Avineet Kumar Singh
  - Department of Computer Science and Engineering, University of South Carolina, Columbia, SC, USA
- Ming Huang
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN, USA
- Dezhi Wu
  - Department of Integrated Information Technology, University of South Carolina, Columbia, SC, USA
- Patricia Cavazos-Rehg
  - Department of Psychiatry, Washington University School of Medicine, 660 S Euclid Ave, St. Louis, MO, USA
9. Kino S, Hsu YT, Shiba K, Chien YS, Mita C, Kawachi I, Daoud A. A scoping review on the use of machine learning in research on social determinants of health: Trends and research prospects. SSM Popul Health 2021; 15:100836. [PMID: 34169138 PMCID: PMC8207228 DOI: 10.1016/j.ssmph.2021.100836] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 03/02/2021] [Revised: 05/15/2021] [Accepted: 06/01/2021] [Indexed: 02/08/2023]
Abstract
Background Machine learning (ML) has spread rapidly from computer science to several disciplines. Given the predictive capacity of ML, it offers new opportunities for health, behavioral, and social scientists. However, it remains unclear how and to what extent ML is being used in studies of social determinants of health (SDH). Methods Using four search engines, we conducted a scoping review of studies that used ML to study SDH (published before May 1, 2020). Two independent reviewers analyzed the relevant studies. For each study, we identified the research questions, results, data, and algorithms. We synthesized our findings in a narrative report. Results Of the initial 8097 hits, we identified 82 relevant studies. The number of publications has risen during the past decade. More than half of the studies (n = 46) used US data. About 80% (n = 66) utilized surveys, and 70% (n = 57) employed ML for common prediction tasks. Although the number of studies in ML and SDH is growing rapidly, only a few studies used ML to improve causal inference, curate data, or identify social bias in predictions (i.e., algorithmic fairness). Conclusions While ML equips researchers with new ways to measure health outcomes and their determinants from non-conventional sources such as text, audio, and image data, most studies still rely on traditional surveys. Although there are no guarantees that ML will lead to better social epidemiological research, the potential for innovation in SDH research is evident as a result of harnessing the predictive power of ML for causality, data curation, or algorithmic fairness.
Affiliation(s)
- Shiho Kino
  - Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Department of Social Epidemiology, Kyoto University, Kyoto, Japan
- Yu-Tien Hsu
  - Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Koichiro Shiba
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Yung-Shin Chien
  - Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Carol Mita
  - Countway Library of Medicine, Harvard University, Boston, MA, USA
- Ichiro Kawachi
  - Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Adel Daoud
  - Center for Population and Development Studies, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
  - Department of Sociology and Work Science, University of Gothenburg, Sweden
  - The Division of Data Science and Artificial Intelligence of the Department of Computer Science and Engineering, Chalmers University of Technology, Sweden
  - Institute for Analytical Sociology, Linköping University, Sweden
10. Singh T, Roberts K, Cohen T, Cobb N, Wang J, Fujimoto K, Myneni S. Social Media as a Research Tool (SMaaRT) for Risky Behavior Analytics: Methodological Review. JMIR Public Health Surveill 2020; 6:e21660. [PMID: 33252345 PMCID: PMC7735906 DOI: 10.2196/21660] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 06/20/2020] [Revised: 10/05/2020] [Accepted: 11/06/2020] [Indexed: 12/11/2022]
Abstract
BACKGROUND Modifiable risky health behaviors, such as tobacco use, excessive alcohol use, being overweight, lack of physical activity, and unhealthy eating habits, are some of the major factors for developing chronic health conditions. Social media platforms have become indispensable means of communication in the digital era. They provide an opportunity for individuals to express themselves, as well as share their health-related concerns with peers and health care providers, with respect to risky behaviors. Such peer interactions can be utilized as valuable data sources to better understand inter- and intrapersonal psychosocial mediators and the mechanisms of social influence that drive behavior change. OBJECTIVE The objective of this review is to summarize computational and quantitative techniques facilitating the analysis of data generated through peer interactions pertaining to risky health behaviors on social media platforms. METHODS We performed a systematic review of the literature in September 2020 by searching three databases (PubMed, Web of Science, and Scopus) using relevant keywords, such as "social media," "online health communities," "machine learning," "data mining," etc. The reporting of the studies was directed by the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. Two reviewers independently assessed the eligibility of studies based on the inclusion and exclusion criteria. We extracted the required information from the selected studies. RESULTS The initial search returned a total of 1554 studies, and after careful analysis of titles, abstracts, and full texts, a total of 64 studies were included in this review. 
We extracted the following key characteristics from all of the studies: social media platform used for conducting the study, risky health behavior studied, the number of posts analyzed, study focus, key methodological functions and tools used for data analysis, evaluation metrics used, and summary of the key findings. The most commonly used social media platform was Twitter, followed by Facebook, QuitNet, and Reddit. The most commonly studied risky health behavior was nicotine use, followed by drug or substance abuse and alcohol use. Various supervised and unsupervised machine learning approaches were used for analyzing textual data generated from online peer interactions. Few studies utilized deep learning methods for analyzing textual data as well as image or video data. Social network analysis was also performed, as reported in some studies. CONCLUSIONS Our review consolidates the methodological underpinnings for analyzing risky health behaviors and has enhanced our understanding of how social media can be leveraged for nuanced behavioral modeling and representation. The knowledge gained from our review can serve as a foundational component for the development of persuasive health communication and effective behavior modification technologies aimed at the individual and population levels.
Affiliation(s)
- Tavleen Singh
- School of Biomedical Informatics, The University of Texas Health Science Center, Houston, TX, United States
| | - Kirk Roberts
- School of Biomedical Informatics, The University of Texas Health Science Center, Houston, TX, United States
| | - Trevor Cohen
- Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, United States
| | - Nathan Cobb
- Georgetown University Medical Center, Washington, DC, United States
| | - Jing Wang
- School of Nursing, The University of Texas Health Science Center, San Antonio, TX, United States
| | - Kayo Fujimoto
- School of Public Health, The University of Texas Health Science Center, Houston, TX, United States
| | - Sahiti Myneni
- School of Biomedical Informatics, The University of Texas Health Science Center, Houston, TX, United States
|
11
|
Hockenhull J, Black JC, Bletz A, Margolin Z, Olson R, Wood DM, Dart RC, Dargan PI. An evaluation of online discussion relating to nonmedical use of prescription opioids within the UK. Br J Clin Pharmacol 2020; 87:1637-1646. [PMID: 33464643 DOI: 10.1111/bcp.14603] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2020] [Revised: 08/21/2020] [Accepted: 09/30/2020] [Indexed: 01/06/2023] Open
Abstract
AIM To identify and describe the nature of online discussion relating to prescription opioids within the UK. METHODS We performed analysis of posts originating in the UK related to buprenorphine, hydrocodone, oxycodone and tramadol using Social Studio, a web-monitoring platform. The study included posts published between January 2014 and December 2016. The data were cleaned to produce a final dataset consisting only of substantive mentions, which were then categorised by defined themes. RESULTS The final dataset included a total of 17 361 substantive mentions (2936 buprenorphine, 2894 hydrocodone, 3826 oxycodone and 7705 tramadol). The most common theme for all 4 drugs was sharing experience or opinion comprising over 90% of mentions for each drug, while discussion related to polysubstance use was present in >1/4 of mentions across drug substances. Mentions related to diversion were more common for hydrocodone and oxycodone (8.1% [6.3-10.1 95% confidence interval] and 7.8% [6.5-9.2], respectively) than buprenorphine or tramadol (4.1 and 3.9% [3.5-4.3], respectively). CONCLUSION This investigation shows that there is substantial online discussion relating to a variety of nonmedical use (NMU) behaviours of prescription opioids within the UK, including for hydrocodone, which is not medically available. Web monitoring provides useful data and merits future investigation; this could include expansion to other categories of drugs and a more in-depth analysis of motivations behind NMU, both of which could add timely evidence regarding the current situation in the UK and help inform public health interventions for NMU of prescription drugs.
Affiliation(s)
- Joanna Hockenhull
- Clinical Toxicology, Guy's and St Thomas' NHS Foundation Trust, London, UK
| | - Joshua C Black
- Rocky Mountain Poison & Drug Safety, Denver Health and Hospital Authority, Denver, CO, USA
| | - Alex Bletz
- Rocky Mountain Poison & Drug Safety, Denver Health and Hospital Authority, Denver, CO, USA
| | - Zachary Margolin
- Rocky Mountain Poison & Drug Safety, Denver Health and Hospital Authority, Denver, CO, USA
| | - Rick Olson
- Rocky Mountain Poison & Drug Safety, Denver Health and Hospital Authority, Denver, CO, USA
| | - David M Wood
- Clinical Toxicology, Guy's and St Thomas' NHS Foundation Trust, London, UK.,Faculty of Life Sciences and Medicine, King's College London, London, UK
| | - Richard C Dart
- Rocky Mountain Poison & Drug Safety, Denver Health and Hospital Authority, Denver, CO, USA
| | - Paul I Dargan
- Clinical Toxicology, Guy's and St Thomas' NHS Foundation Trust, London, UK.,Faculty of Life Sciences and Medicine, King's College London, London, UK
|
12
|
van Draanen J, Tao H, Gupta S, Liu S. Geographic Differences in Cannabis Conversations on Twitter: Infodemiology Study. JMIR Public Health Surveill 2020; 6:e18540. [PMID: 33016888 PMCID: PMC7573699 DOI: 10.2196/18540] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/19/2020] [Revised: 08/28/2020] [Accepted: 08/31/2020] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Infodemiology is an emerging field of research that utilizes user-generated health-related content, such as that found in social media, to help improve public health. Twitter has become an important venue for studying emerging patterns in health issues such as substance use because it can reflect trends in real-time and display messages generated directly by users, giving a uniquely personal voice to analyses. Over the past year, several states in the United States have passed legislation to legalize adult recreational use of cannabis and the federal government in Canada has done the same. There are few studies that examine the sentiment and content of tweets about cannabis since the recent legislative changes regarding cannabis have occurred in North America. OBJECTIVE To examine differences in the sentiment and content of cannabis-related tweets by state cannabis laws, and to examine differences in sentiment between the United States and Canada between 2017 and 2019. METHODS In total, 1,200,127 cannabis-related tweets were collected from January 1, 2017, to June 17, 2019, using the Twitter application programming interface. Tweets then were grouped geographically based on cannabis legal status (legal for adult recreational use, legal for medical use, and no legal use) in the locations from which the tweets came. Sentiment scoring for the tweets was done with VADER (Valence Aware Dictionary and sEntiment Reasoner), and differences in sentiment for states with different cannabis laws were tested using Tukey adjusted two-sided pairwise comparisons. Topic analysis to determine the content of tweets was done using latent Dirichlet allocation in Python, using a Java implementation, LdaMallet, with Gensim wrapper. 
RESULTS Significant differences were seen in tweet sentiment between US states with different cannabis laws (P=.001 for negative sentiment tweets in fully illegal compared to legal for adult recreational use states), as well as between the United States and Canada (P=.003 for positive sentiment and P=.001 for negative sentiment). In both cases, restrictive state policy environments (eg, those where cannabis use is fully illegal, or legal for medical use only) were associated with more negative tweet sentiment than less restrictive policy environments (eg, where cannabis is legal for adult recreational use). Six key topics were found in recent US tweet contents: fun and recreation (keywords, eg, love, life, high); daily life (today, start, live); transactions (buy, sell, money); places of use (room, car, house); medical use and cannabis industry (business, industry, company); and legalization (legalize, police, tax). The keywords representing content of tweets also differed between the United States and Canada. CONCLUSIONS Knowledge about how cannabis is being discussed online, and geographic differences that exist in these conversations may help to inform public health planning and prevention efforts. Public health education about how to use cannabis in ways that promote safety and minimize harms may be especially important in places where cannabis is legal for adult recreational and medical use.
Affiliation(s)
- Jenna van Draanen
- Department of Sociology, University of British Columbia, Vancouver, BC, Canada
| | - HaoDong Tao
- Department of Computer Science, University of Victoria, Victoria, BC, Canada
| | - Saksham Gupta
- School of Exercise Science, Physical & Health Education, University of Victoria, Victoria, BC, Canada
| | - Sam Liu
- School of Exercise Science, Physical & Health Education, University of Victoria, Victoria, BC, Canada
|
13
|
Cai M, Shah N, Li J, Chen WH, Cuomo RE, Obradovich N, Mackey TK. Identification and characterization of tweets related to the 2015 Indiana HIV outbreak: A retrospective infoveillance study. PLoS One 2020; 15:e0235150. [PMID: 32845882 PMCID: PMC7449407 DOI: 10.1371/journal.pone.0235150] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2020] [Accepted: 04/20/2020] [Indexed: 11/29/2022] Open
Abstract
INTRODUCTION From late 2014 through 2015, Scott County, Indiana faced an HIV outbreak triggered by opioid abuse and transition to injection drug use. Investigating the origins, risk factors, and responses related to this outbreak is critical to inform future surveillance, interventions, and policymaking. In response, this retrospective infoveillance study identifies and characterizes user-generated messages related to opioid abuse, heroin injection drug use, and HIV status using natural language processing (NLP) among Twitter users in Indiana during the period of this HIV outbreak. MATERIALS AND METHODS Our study consisted of two phases: data collection and processing, and data analysis. We collected Indiana geolocated tweets from the public Twitter API using Amazon Web Services EC2 instances filtered for geocoded messages in the immediate pre and post period of the outbreak. In the data analysis phase we applied an unsupervised machine learning approach using NLP called the Biterm Topic Model (BTM) to identify tweets related to opioid, heroin/injection, and HIV behavior and then examined these messages for HIV risk-related topics that could be associated with the outbreak. RESULTS More than 10 million geocoded tweets occurring in Indiana during the immediate pre and post period of the outbreak were collected for analysis. Using BTM, we identified 1350 tweets thought to be relevant to the outbreak and then confirmed 358 tweets using human annotation. The most prevalent themes identified were tweets related to self-reported abuse of illicit and prescription drugs, opioid use disorder, self-reported HIV status, and public sentiment regarding the outbreak. Geospatial analysis found that these messages clustered in population dense areas outside of the outbreak, including Indianapolis and neighboring Clark County. 
DISCUSSION This infoveillance study characterized the social media conversations of communities in Indiana in the pre and post period of the 2015 HIV outbreak. Behavioral themes detected reflect discussion about risk factors related to HIV transmission stemming from opioid and heroin abuse for priority populations, and also help identify community attitudes that could have motivated or detracted from the use of HIV prevention methods, along with helping identify factors that can impede access to prevention services. CONCLUSIONS Infoveillance approaches, such as the analysis conducted in this study, represent a possible strategy to detect a "signal" of the emergence of risk factors associated with an outbreak, though they may be limited in their scope and generalizability. Our results, in conjunction with other forms of public health surveillance, can leverage the growing ubiquity of social media platforms to better detect opioid-related HIV risk knowledge, attitudes and behavior, as well as inform future prevention efforts.
Affiliation(s)
- Mingxiang Cai
- Global Health Policy Institute, San Diego, CA, United States of America
- Department of Healthcare Research and Policy, University of California, San Diego, CA, United States of America
- Department of Computer Science and Engineering, University of California, San Diego, CA, United States of America
| | - Neal Shah
- Global Health Policy Institute, San Diego, CA, United States of America
- Department of Healthcare Research and Policy, University of California, San Diego, CA, United States of America
| | - Jiawei Li
- Global Health Policy Institute, San Diego, CA, United States of America
- Department of Healthcare Research and Policy, University of California, San Diego, CA, United States of America
- Department of Computational Science, Mathematics and Engineering, University of California, San Diego, CA, United States of America
| | - Wen-Hao Chen
- Department of Healthcare Research and Policy, University of California, San Diego, CA, United States of America
- Department of Computer Science and Engineering, University of California, San Diego, CA, United States of America
| | - Raphael E. Cuomo
- Global Health Policy Institute, San Diego, CA, United States of America
- Department of Anesthesiology, San Diego School of Medicine, University of California, San Diego, CA, United States of America
| | | | - Tim K. Mackey
- Global Health Policy Institute, San Diego, CA, United States of America
- Department of Healthcare Research and Policy, University of California, San Diego, CA, United States of America
- Department of Anesthesiology, San Diego School of Medicine, University of California, San Diego, CA, United States of America
- Division of Infectious Diseases and Global Public Health, Department of Medicine, San Diego School of Medicine, University of California, San Diego, CA, United States of America
|
14
|
Atique S, Bautista JR, Block LJ, Lee JJ, Lozada-Perezmitre E, Nibber R, O'Connor S, Peltonen LM, Ronquillo C, Tayaben J, Thilo FJS, Topaz M. A nursing informatics response to COVID-19: Perspectives from five regions of the world. J Adv Nurs 2020; 76:2462-2468. [PMID: 32420652 PMCID: PMC7276900 DOI: 10.1111/jan.14417] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2020] [Accepted: 05/12/2020] [Indexed: 01/28/2023]
Affiliation(s)
- Suleman Atique
- Department of Health Informatics, College of Public Health and Health Informatics, University of Ha'il, Ha'il, Saudi Arabia
| | - John R Bautista
- School of Information, The University of Texas at Austin, Austin, USA
| | - Lorraine J Block
- School of Nursing, University of British Columbia, Vancouver, Canada
| | - Jung Jae Lee
- School of Nursing, The University of Hong Kong, Pokfulam, Hong Kong
| | | | - Raji Nibber
- School of Nursing, University of British Columbia, Vancouver, Canada
| | - Siobhan O'Connor
- School of Health in Social Science, The University of Edinburgh, Edinburgh, United Kingdom
| | | | | | - Jude Tayaben
- College of Nursing, Benguet State University, Benguet, Philippines
| | - Friederike J S Thilo
- Applied Research and Development in Nursing, Department of Health Professions, Bern University of Applied Sciences, Bern, Switzerland
| | - Maxim Topaz
- School of Nursing and Data Science Institute, Columbia University, New York, USA
|
15
|
Cao Y, Stewart K, Factor J, Billing A, Massey E, Artigiani E, Wagner M, Dezman Z, Wish E. Using socially-sensed data to infer ZIP level characteristics for the spatiotemporal analysis of drug-related health problems in Maryland. Health Place 2020; 63:102345. [PMID: 32543431 DOI: 10.1016/j.healthplace.2020.102345] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/07/2019] [Revised: 04/02/2020] [Accepted: 04/14/2020] [Indexed: 01/07/2023]
Abstract
This research investigated how socially sensed data can be used to detect ZIP level characteristics that are associated with spatial and temporal patterns of Emergency Department patients with a chief complaint and/or diagnosis of overdose or drug-related health problems for four hospitals in Baltimore and Anne Arundel County, MD during 2016-2018. Dynamic characteristics were identified using socially-sensed data (i.e., geo-tagged Twitter data) at ZIP code level over varying temporal resolutions. Data about three place-based variables, including comments and concerns about crime, drug use, and negative or depressed sentiments, were extracted from tweets, and data on four socio-environmental variables from the American Community Survey were collected to explore socio-environmental characteristics during the same period. Our study showed that a statistically significant increase in adjusted rates of Emergency Department (ED) visits occurred between June and November 2017 for patients residing in ZIP codes in western Baltimore and northeastern Anne Arundel County. During this period, the three topics extracted from Twitter data were highly correlated with the ZIP codes where the patients were residing. Exploring the dynamic spatial associations between socio-environmental variables and ED visits for acute overdose assists local health officials in optimizing interventions for vulnerable locations.
Affiliation(s)
- Yanjia Cao
- Division of Infectious Diseases and Geographic Medicine, Stanford University School of Medicine, Stanford, CA, USA.
| | - Kathleen Stewart
- Center for Geospatial Information Science, Department of Geographical Sciences, University of Maryland, College Park, MD, USA
| | - Julie Factor
- Center for Substance Abuse Research, University of Maryland, College Park, MD, USA
| | - Amy Billing
- Center for Substance Abuse Research, University of Maryland, College Park, MD, USA
| | - Ebonie Massey
- Center for Substance Abuse Research, University of Maryland, College Park, MD, USA
| | - Eleanor Artigiani
- Center for Substance Abuse Research, University of Maryland, College Park, MD, USA
| | - Michael Wagner
- Center for Substance Abuse Research, University of Maryland, College Park, MD, USA
| | - Zachary Dezman
- Department of Emergency Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
| | - Eric Wish
- Center for Substance Abuse Research, University of Maryland, College Park, MD, USA
|
16
|
Kim MG, Kim J, Kim SC, Jeong J. Twitter Analysis of the Nonmedical Use and Side Effects of Methylphenidate: Machine Learning Study. J Med Internet Res 2020; 22:e16466. [PMID: 32130160 PMCID: PMC7063527 DOI: 10.2196/16466] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/02/2019] [Revised: 01/08/2020] [Accepted: 01/27/2020] [Indexed: 01/20/2023] Open
Abstract
Background Methylphenidate, a stimulant used to treat attention deficit hyperactivity disorder, has the potential to be used nonmedically, such as for studying and recreation. In an era when many people actively use social networking services, experience with the nonmedical use or side effects of methylphenidate might be shared on Twitter. Objective The purpose of this study was to analyze tweets about the nonmedical use and side effects of methylphenidate using a machine learning approach. Methods A total of 34,293 tweets mentioning methylphenidate from August 2018 to July 2019 were collected using searches for “methylphenidate” and its brand names. Tweets in a randomly selected training dataset (6860/34,293, 20.00%) were annotated as positive or negative for two dependent variables: nonmedical use and side effects. Features such as personal noun, nonmedical use terms, medical use terms, side effect terms, sentiment scores, and the presence of a URL were generated for supervised learning. Using the labeled training dataset and features, support vector machine (SVM) classifiers were built and the performance was evaluated using F1 scores. The classifiers were applied to the test dataset to determine the number of tweets about nonmedical use and side effects. Results Of the 6860 tweets in the training dataset, 5.19% (356/6860) and 5.52% (379/6860) were about nonmedical use and side effects, respectively. Performance of SVM classifiers for nonmedical use and side effects, expressed as F1 scores, were 0.547 (precision: 0.926, recall: 0.388, and accuracy: 0.967) and 0.733 (precision: 0.920, recall: 0.609, and accuracy: 0.976), respectively. In the test dataset, the SVM classifiers identified 361 tweets (1.32%) about nonmedical use and 519 tweets (1.89%) about side effects. The proportion of tweets about nonmedical use was highest in May 2019 (46/2624, 1.75%) and December 2018 (36/2041, 1.76%). 
Conclusions The SVM classifiers that were built in this study were highly precise and accurate and will help to automatically identify the nonmedical use and side effects of methylphenidate using Twitter.
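The F1 scores reported in this abstract can be checked against the stated precision and recall. The snippet below is a minimal sketch, not the authors' code: it assumes the standard definition of F1 as the harmonic mean of precision and recall, and the function name `f1_score` and the dictionary `reported` are illustrative names introduced here.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision/recall pairs as stated in the abstract (illustrative container name).
reported = {
    "nonmedical use": (0.926, 0.388),
    "side effects": (0.920, 0.609),
}

for label, (p, r) in reported.items():
    # Rounds to the three decimals used in the abstract.
    print(f"{label}: F1 = {f1_score(p, r):.3f}")
```

Under that assumption the computed values round to 0.547 and 0.733, matching the abstract. The gap between these F1 scores and the reported accuracies (0.967 and 0.976) is what one would expect when only about 5% of the training tweets are positive: accuracy is dominated by the majority class, while F1 is pulled down by the lower recall.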
Affiliation(s)
- Myeong Gyu Kim
- Graduate School of Clinical Pharmacy, CHA University, Pocheon, Republic of Korea
| | - Jungu Kim
- Graduate School of Clinical Pharmacy, CHA University, Pocheon, Republic of Korea
| | - Su Cheol Kim
- Department of Psychiatry, Anam Hospital, Seoul, Republic of Korea
| | - Jaegwon Jeong
- Department of Psychiatry, Anam Hospital, Seoul, Republic of Korea
|
17
|
Hu H, Phan N, Chun SA, Geller J, Vo H, Ye X, Jin R, Ding K, Kenne D, Dou D. An insight analysis and detection of drug-abuse risk behavior on Twitter with self-taught deep learning. COMPUTATIONAL SOCIAL NETWORKS 2019. [DOI: 10.1186/s40649-019-0071-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
Drug abuse continues to accelerate towards becoming the most severe public health problem in the United States. The ability to detect drug-abuse risk behavior at a population scale, such as among the population of Twitter users, can help us to monitor the trend of drug-abuse incidents. Unfortunately, traditional methods do not effectively detect drug-abuse risk behavior, given tweets. This is because: (1) tweets usually are noisy and sparse and (2) the availability of labeled data is limited. To address these challenging problems, we propose a deep self-taught learning system to detect and monitor drug-abuse risk behaviors in the Twitter sphere, by leveraging a large amount of unlabeled data. Our models automatically augment annotated data: (i) to improve the classification performance and (ii) to capture the evolving picture of drug abuse on online social media. Our extensive experiments have been conducted on three million drug-abuse-related tweets with geo-location information. Results show that our approach is highly effective in detecting drug-abuse risk behaviors.
|
18
|
Klein AZ, Sarker A, Cai H, Weissenbacher D, Gonzalez-Hernandez G. Social media mining for birth defects research: A rule-based, bootstrapping approach to collecting data for rare health-related events on Twitter. J Biomed Inform 2018; 87:68-78. [PMID: 30292855 PMCID: PMC6295660 DOI: 10.1016/j.jbi.2018.10.001] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2018] [Revised: 09/26/2018] [Accepted: 10/03/2018] [Indexed: 10/28/2022]
Abstract
BACKGROUND Although birth defects are the leading cause of infant mortality in the United States, methods for observing human pregnancies with birth defect outcomes are limited. OBJECTIVE The primary objectives of this study were (i) to assess whether rare health-related events (in this case, birth defects) are reported on social media, (ii) to design and deploy a natural language processing (NLP) approach for collecting such sparse data from social media, and (iii) to utilize the collected data to discover a cohort of women whose pregnancies with birth defect outcomes could be observed on social media for epidemiological analysis. METHODS To assess whether birth defects are mentioned on social media, we mined 432 million tweets posted by 112,647 users who were automatically identified via their public announcements of pregnancy on Twitter. To retrieve tweets that mention birth defects, we developed a rule-based, bootstrapping approach, which relies on a lexicon, lexical variants generated from the lexicon entries, regular expressions, post-processing, and manual analysis guided by distributional properties. To identify users whose pregnancies with birth defect outcomes could be observed for epidemiological analysis, inclusion criteria were (i) tweets indicating that the user's child has a birth defect, and (ii) accessibility to the user's tweets during pregnancy. We conducted a semi-automatic evaluation to estimate the recall of the tweet-collection approach, and performed a preliminary assessment of the prevalence of selected birth defects among the pregnancy cohort derived from Twitter. RESULTS We manually annotated 16,822 retrieved tweets, distinguishing tweets indicating that the user's child has a birth defect (true positives) from tweets that merely mention birth defects (false positives). Inter-annotator agreement was substantial: κ = 0.79 (Cohen's kappa). 
Analyzing the timelines of the 646 users whose tweets were true positives resulted in the discovery of 195 users that met the inclusion criteria. Congenital heart defects are the most common type of birth defect reported on Twitter, consistent with findings in the general population. Based on an evaluation of 4169 tweets retrieved using alternative text mining methods, the recall of the tweet-collection approach was 0.95. CONCLUSIONS Our contributions include (i) evidence that rare health-related events are indeed reported on Twitter, (ii) a generalizable, systematic NLP approach for collecting sparse tweets, (iii) a semi-automatic method to identify undetected tweets (false negatives), and (iv) a collection of publicly available tweets by pregnant users with birth defect outcomes, which could be used for future epidemiological analysis. In future work, the annotated tweets could be used to train machine learning algorithms to automatically identify users reporting birth defect outcomes, enabling the large-scale use of social media mining as a complementary method for such epidemiological research.
Affiliation(s)
- Ari Z Klein
- Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States.
| | - Abeed Sarker
- Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States.
| | - Haitao Cai
- Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States.
| | - Davy Weissenbacher
- Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States.
| | - Graciela Gonzalez-Hernandez
- Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States.
|