1
Walsh J, Cave J, Griffiths F. Combining Topic Modeling, Sentiment Analysis, and Corpus Linguistics to Analyze Unstructured Web-Based Patient Experience Data: Case Study of Modafinil Experiences. J Med Internet Res 2024; 26:e54321. [PMID: 39662896 DOI: 10.2196/54321] [Received: 11/06/2023] [Revised: 06/19/2024] [Accepted: 09/27/2024] [Indexed: 12/13/2024]
Abstract
BACKGROUND Patient experience data from social media offer patient-centered perspectives on disease, treatments, and health service delivery. Current guidelines typically rely on systematic reviews, while qualitative health studies are often seen as anecdotal and nongeneralizable. This study explores combining personal health experiences from multiple sources to create generalizable evidence. OBJECTIVE The study aims to (1) investigate how combining unsupervised natural language processing (NLP) and corpus linguistics can explore patient perspectives from a large unstructured dataset of modafinil experiences, (2) compare findings with Cochrane meta-analyses on modafinil's effectiveness, and (3) develop a methodology for analyzing such data. METHODS We analyzed 69,022 posts from 790 sources with a variety of NLP and corpus techniques, including data cleaning techniques to maximize post context, Python for NLP techniques, and Sketch Engine for linguistic analysis. We used multiple topic mining approaches, such as latent Dirichlet allocation, nonnegative matrix factorization, and word-embedding methods. Sentiment analysis used TextBlob and Valence Aware Dictionary and Sentiment Reasoner, while corpus methods included collocation, concordance, and n-gram generation. Previous work had mapped topic mining to themes, such as health conditions, reasons for taking modafinil, symptom impacts, dosage, side effects, effectiveness, and treatment comparisons. RESULTS Key findings of the study included modafinil use across 166 health conditions, most frequently narcolepsy, multiple sclerosis, attention-deficit disorder, anxiety, sleep apnea, depression, bipolar disorder, chronic fatigue syndrome, fibromyalgia, and chronic disease. Word-embedding topic modeling mapped 70% of posts to predefined themes, while sentiment analysis revealed 65% positive responses, 6% neutral responses, and 28% negative responses. Notably, the perceived effectiveness of modafinil for various conditions strongly contrasts with the findings of existing randomized controlled trials and systematic reviews, which conclude insufficient or low-quality evidence of effectiveness. CONCLUSIONS This study demonstrated the value of combining NLP with linguistic techniques for analyzing large unstructured text datasets. Despite varying opinions, findings were methodologically consistent and challenged existing clinical evidence. This suggests that patient-generated data could provide valuable insights into treatment outcomes, potentially improving clinical understanding and patient care.
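The n-gram generation named among the corpus methods above can be illustrated with a minimal sketch. It uses only the Python standard library rather than Sketch Engine, and the tokenizer, the function names, and the sample posts are all hypothetical; none of them come from the study.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a post and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return the n-grams (as tuples) found in a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(posts, n=2):
    """Count n-grams over a collection of posts; n-grams never span two posts."""
    counts = Counter()
    for post in posts:
        counts.update(ngrams(tokenize(post), n))
    return counts


# Made-up example posts, for illustration only (not data from the study).
sample_posts = [
    "Modafinil helps me stay awake at work",
    "It helps me stay awake but gives me headaches",
]
bigrams = ngram_counts(sample_posts, n=2)
```

Calling `bigrams.most_common(5)` would list the most frequent word pairs, which is the kind of frequency view an analyst might inspect before moving on to collocation or concordance analysis.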
Affiliation(s)
- Julia Walsh
  - Warwick Medical School, University of Warwick, Coventry, United Kingdom
- Jonathan Cave
  - Department of Economics, University of Warwick, Coventry, United Kingdom
- Frances Griffiths
  - Warwick Medical School, University of Warwick, Coventry, United Kingdom
  - Centre for Health Policy, University of the Witwatersrand, Johannesburg, South Africa
2
Chandrasekaran R, Kotaki S, Nagaraja AH. Detecting and tracking depression through temporal topic modeling of tweets: insights from a 180-day study. NPJ MENTAL HEALTH RESEARCH 2024; 3:62. [PMID: 39643656 PMCID: PMC11624259 DOI: 10.1038/s44184-024-00107-5] [Received: 07/16/2024] [Accepted: 11/24/2024] [Indexed: 12/09/2024]
Abstract
Depression affects over 280 million people globally, yet many cases remain undiagnosed or untreated due to stigma and lack of awareness. Social media platforms like X (formerly Twitter) offer a way to monitor and analyze depression markers. This study analyzes Twitter data from the 90 days before and the 90 days after a self-disclosed clinical diagnosis. We gathered 246,637 tweets from 229 diagnosed users. CorEx topic modeling identified seven themes: causes, physical symptoms, mental symptoms, swear words, treatment, coping/support mechanisms, and lifestyle. Conditional logistic regression assessed the odds of these themes occurring post-diagnosis. A control group of healthy users (284,772 tweets) was used to develop and evaluate machine learning classifiers (support vector machines, naive Bayes, and logistic regression) to distinguish between depressed and non-depressed users. Logistic regression and SVM performed best. These findings show the potential of Twitter data for tracking depression and changes in symptoms, coping mechanisms, and treatment use.
Affiliation(s)
- Ranganathan Chandrasekaran
  - Department of Information & Decision Sciences, University of Illinois at Chicago, Chicago, IL, USA
  - Department of Biomedical and Health Information Sciences, University of Illinois at Chicago, Chicago, IL, USA
- Suhas Kotaki
  - Department of Information & Decision Sciences, University of Illinois at Chicago, Chicago, IL, USA
3
de Anta L, Alvarez-Mon MÁ, Pereira-Sanchez V, Donat-Vargas CC, Lara-Abelenda FJ, Arrieta M, Montero-Torres M, García-Montero C, Fraile-Martínez Ó, Mora F, Ortega MÁ, Alvarez-Mon M, Quintero J. Assessment of beliefs and attitudes towards benzodiazepines using machine learning based on social media posts: an observational study. BMC Psychiatry 2024; 24:659. [PMID: 39379861 PMCID: PMC11462674 DOI: 10.1186/s12888-024-06111-5] [Received: 03/20/2024] [Accepted: 09/24/2024] [Indexed: 10/10/2024]
Abstract
BACKGROUND Benzodiazepines are frequently prescribed drugs; however, their prolonged use can lead to tolerance, dependence, and other adverse effects. Despite these risks, long-term use remains common, presenting a public health concern. This study aims to explore the beliefs and opinions held by the public regarding benzodiazepines, as understanding these perspectives may provide insights into their usage patterns. METHODS We collected public tweets published in English between January 1, 2019, and October 31, 2020, that mentioned benzodiazepines. The content of each tweet and the characteristics of the users were analyzed using a mixed-method approach, including manual analysis and semi-supervised machine learning. RESULTS Over half of the Twitter users highlighted the efficacy of benzodiazepines, with minimal discussion of their side effects. The most active participants in these conversations were patients and their families, with health professionals and institutions being notably absent. Additionally, the drugs most frequently mentioned corresponded with those most commonly prescribed by healthcare professionals. CONCLUSIONS Social media platforms offer valuable insights into users' experiences and opinions regarding medications. Notably, the sentiment towards benzodiazepines is predominantly positive, with users viewing them as effective while rarely mentioning side effects. This analysis underscores the need to educate physicians, patients, and their families about the potential risks associated with benzodiazepine use and to promote clinical guidelines that support the proper management of these medications. CLINICAL TRIAL NUMBER Not applicable.
Affiliation(s)
- Laura de Anta
  - Department of Psychiatry and Mental Health, Hospital Universitario Infanta Leonor, Madrid, Spain
  - Department of Medicine and Medical Specialities, University of Alcala, Alcala de Henares, Madrid, 28801, Spain
- Miguel Ángel Alvarez-Mon
  - Department of Psychiatry and Mental Health, Hospital Universitario Infanta Leonor, Madrid, Spain
  - Department of Medicine and Medical Specialities, University of Alcala, Alcala de Henares, Madrid, 28801, Spain
  - Ramón y Cajal Institute of Sanitary Research (IRYCIS), Madrid, 28034, Spain
- Victor Pereira-Sanchez
  - Department of Child and Adolescent Psychiatry, NYU Grossman School of Medicine, New York, NY, USA
- Carolina C Donat-Vargas
  - ISGlobal, Barcelona, Spain
  - CIBER Epidemiología y Salud Pública (CIBERESP), Madrid, Spain
  - Unit of Cardiovascular and Nutritional Epidemiology, Institute of Environmental Medicine, Karolinska Institute, Stockholm, Sweden
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Francisco J Lara-Abelenda
  - Department of Medicine and Medical Specialities, University of Alcala, Alcala de Henares, Madrid, 28801, Spain
  - Department of Signal Theory and Communications, Rey Juan Carlos University, Fuenlabrada, Madrid, 28942, Spain
- María Arrieta
  - Department of Psychiatry and Mental Health, Hospital Universitario Infanta Leonor, Madrid, Spain
- María Montero-Torres
  - Departamento de Ingeniería Electrónica, Universidad Politécnica de Madrid, Madrid, Spain
- Cielo García-Montero
  - Department of Medicine and Medical Specialities, University of Alcala, Alcala de Henares, Madrid, 28801, Spain
  - Ramón y Cajal Institute of Sanitary Research (IRYCIS), Madrid, 28034, Spain
- Óscar Fraile-Martínez
  - Department of Medicine and Medical Specialities, University of Alcala, Alcala de Henares, Madrid, 28801, Spain
  - Ramón y Cajal Institute of Sanitary Research (IRYCIS), Madrid, 28034, Spain
- Fernando Mora
  - Department of Psychiatry and Mental Health, Hospital Universitario Infanta Leonor, Madrid, Spain
  - Department of Legal and Psychiatry, Complutense University, Madrid, Spain
- Miguel Ángel Ortega
  - Department of Medicine and Medical Specialities, University of Alcala, Alcala de Henares, Madrid, 28801, Spain
  - Ramón y Cajal Institute of Sanitary Research (IRYCIS), Madrid, 28034, Spain
- Melchor Alvarez-Mon
  - Department of Medicine and Medical Specialities, University of Alcala, Alcala de Henares, Madrid, 28801, Spain
  - Ramón y Cajal Institute of Sanitary Research (IRYCIS), Madrid, 28034, Spain
- Javier Quintero
  - Department of Psychiatry and Mental Health, Hospital Universitario Infanta Leonor, Madrid, Spain
  - Department of Legal and Psychiatry, Complutense University, Madrid, Spain
4
Babanejaddehaki G, An A, Davoudi H. Ontology-Based Data Collection for a Hybrid Outbreak Detection Method Using Social Media. IEEE Trans Nanobioscience 2024; 23:591-602. [PMID: 39137072 DOI: 10.1109/tnb.2024.3442912] [Indexed: 08/15/2024]
Abstract
Given the persistent global challenge presented by rapidly spreading diseases, as evidenced notably by the widespread impact of the COVID-19 pandemic on both human health and economies worldwide, the necessity of developing effective infectious disease prediction models has become of utmost importance. In this context, the utilization of online social media platforms as valuable tools in healthcare settings has gained prominence, offering direct avenues for disseminating critical health information to the public in a timely and accessible manner. Propelled by the ubiquitous accessibility of the internet through computers and mobile devices, these platforms promise to revolutionize traditional detection methods, providing more immediate and reliable epidemiological insights. Leveraging this paradigm shift, our proposed framework harnesses Twitter data associated with infectious disease symptoms, employing ontology to identify and curate relevant tweets. Central to our methodology is a hybrid model that integrates XGBoost and Bidirectional Long Short-Term Memory (BiLSTM) architectures. The integration of XGBoost addresses the challenge of handling small dataset sizes, inherent during outbreaks due to limited time series data. XGBoost serves as a cornerstone for minimizing the loss function and identifying optimal features from our multivariate time series data. Subsequently, the combined dataset, comprising original features and predicted values by XGBoost, is channeled into the BiLSTM for further processing. Through extensive experimentation with a dataset spanning multiple infectious disease outbreaks, our hybrid model demonstrates superior predictive performance compared to state-of-the-art and baseline models. By enhancing forecasting accuracy and outbreak tracking capabilities, our model offers promising prospects for assisting health authorities in mitigating fatalities and proactively preparing for potential outbreaks.
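The data flow described in this abstract, where first-stage predictions are appended to the original features before a sequence model sees them, can be sketched in plain Python. This is only an illustration of the feature-augmentation step under stated assumptions: `first_stage_predict` is a hypothetical stand-in (a simple mean), not XGBoost; no BiLSTM is trained; and the series values are made up.

```python
def first_stage_predict(row):
    """Hypothetical stand-in for the first-stage model (XGBoost in the abstract).

    Here it is simply the mean of the row's features, so the sketch stays
    self-contained; it is not the authors' model.
    """
    return sum(row) / len(row)


def augment_with_predictions(rows, predict=first_stage_predict):
    """Append the first-stage prediction to each time step's feature vector."""
    return [list(row) + [predict(row)] for row in rows]


def sliding_windows(rows, window):
    """Cut the augmented series into overlapping windows, a common input
    shape for a recurrent model such as a BiLSTM."""
    return [rows[i:i + window] for i in range(len(rows) - window + 1)]


# Made-up multivariate time series: 4 time steps, 2 features each.
series = [[1.0, 3.0], [2.0, 4.0], [3.0, 5.0], [4.0, 6.0]]
augmented = augment_with_predictions(series)
windows = sliding_windows(augmented, window=3)
```

Each window now carries three values per time step (two original features plus one first-stage prediction), which is the combined dataset the abstract says is passed on for further processing.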
5
Lane JM, Zhang X, Alcala CS, Midya V, Nagdeo K, Li R, Wright RO. Tweeting environmental pollution: Analyzing twitter language to uncover its correlation with county-level obesity rates in the United States. Prev Med 2024; 186:108081. [PMID: 39038770 DOI: 10.1016/j.ypmed.2024.108081] [Received: 05/01/2024] [Revised: 07/17/2024] [Accepted: 07/18/2024] [Indexed: 07/24/2024]
Abstract
BACKGROUND Environmental pollution has been linked to obesogenic tendencies. Using environmental-related posts from Twitter (now known as X) from U.S. counties, we aim to uncover the association between Twitter linguistic data and U.S. county-level obesity rates. METHODS We analyzed nearly 300 thousand tweets from January 2020 to December 2020 across 207 U.S. counties, using an innovative Differential Language Analysis technique and county-level obesity data from the 2020 Food Environment Atlas, to identify distinct linguistic features in environmental-related Twitter posts that correlated with socioeconomic status (SES) index indicators, obesity rates, and obesity rates controlled for SES index indicators. We also employed predictive modeling to estimate Twitter language's predictive capacity for obesity rates. RESULTS Results revealed a negative correlation between environmental-related tweets and obesity rates, both before and after adjusting for SES. Conversely, non-environmental-related tweets showed a positive association with higher county-level obesity rates, indicating that individuals living in counties with lower obesity rates tend to tweet environmental-related language more frequently than those living in counties with higher obesity rates. The findings suggest that linguistic patterns and expressions employed in discussing environmental-related themes on Twitter can offer unique insights into the prevailing cross-sectional patterns of obesity rates. CONCLUSIONS Although Twitter users are a subset of the general population, incorporating environmental-related tweets and county-level obesity rates and using a novel language analysis technique make this study unique. Our results indicated that Twitter users engaging in more active dialog about environmental concerns might exhibit healthier lifestyle practices, contributing to reduced obesity rates.
Affiliation(s)
- Jamil M Lane
  - Department of Environmental Medicine and Climate Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Xupin Zhang
  - School of Economics and Management, East China Normal University, Shanghai, China
- Cecilia S Alcala
  - Department of Environmental Medicine and Climate Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Vishal Midya
  - Department of Environmental Medicine and Climate Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Kiran Nagdeo
  - Department of Environmental Medicine and Climate Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Rui Li
  - School of Economics and Management, East China Normal University, Shanghai, China
- Robert O Wright
  - Department of Environmental Medicine and Climate Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Institute for Climate Change, Environmental Health, and Exposomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
6
Postma DJ, Heijkoop MLA, De Smet PAGM, Notenboom K, Leufkens HGM, Mantel-Teeuwisse AK. Identifying Medicine Shortages With the Twitter Social Network: Retrospective Observational Study. J Med Internet Res 2024; 26:e51317. [PMID: 39106483 PMCID: PMC11336501 DOI: 10.2196/51317] [Received: 08/22/2023] [Revised: 05/10/2024] [Accepted: 06/21/2024] [Indexed: 08/09/2024]
Abstract
BACKGROUND Early identification is critical for mitigating the impact of medicine shortages on patients. The internet, specifically social media, is an emerging source of health data. OBJECTIVE This study aimed to explore whether a routine analysis of data from the Twitter social network can detect signals of a medicine shortage and serve as an early warning system and, if so, for which medicines or patient groups. METHODS Medicine shortages between January 31 and December 1, 2019, were collected from Farmanco, the national catalog of the Royal Dutch Pharmacists Association (KNMP). Posts on these shortages were collected by searching for the name, the active pharmaceutical ingredient, or the first word of the brand name of the medicines in shortage. Posts were then selected based on relevant keywords that potentially indicated a shortage, and the percentage of shortages with at least 1 post was calculated. The first posts per shortage were analyzed for their timing (median number of days, including the IQR) versus the national catalog, also stratified by disease and medicine characteristics. The content of the first post per shortage was analyzed descriptively for its reporting stakeholder and the nature of the post. RESULTS Of the 341 medicine shortages, 102 (29.9%) were mentioned on Twitter. Of these 102 shortages, 18 (5.3% of the total) were mentioned before or simultaneously with publication by KNMP Farmanco. Only 4 (1.2%) of these were mentioned on Twitter more than 14 days before. Posts were published with a median delay of 37 (IQR 7-81) days relative to publication by KNMP Farmanco. Shortages mentioned on Twitter affected a greater number of patients and lasted longer than those that were not mentioned. We could not conclusively relate either the presence or absence on Twitter to a disease area or route of administration of the medicine in shortage. The first posts on the 102 shortages were mainly published by patients (n=51, 50.0%) and health care professionals (n=46, 45.1%). We identified 8 categories of nature of content. Sharing personal experience (n=44, 43.1%) was the most common category. CONCLUSIONS The Twitter social network is not a suitable early warning system for medicine shortages. Twitter primarily echoes already-known information rather than spreading new information. However, Twitter or potentially any other social media platform provides the opportunity for future qualitative research in the increasingly important field of medicine shortages that investigates how a larger population of patients is affected by shortages.
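The percentages in the RESULTS use two different denominators (all 341 shortages, or the 102 mentioned on Twitter), which is easy to misread. The short sketch below recomputes them from the counts given in the abstract; the helper `pct` and the variable names are illustrative only.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


TOTAL_SHORTAGES = 341   # all shortages in the study period
ON_TWITTER = 102        # shortages with at least one post

# Shares of all 341 shortages.
share_on_twitter = pct(ON_TWITTER, TOTAL_SHORTAGES)
share_before_or_same_day = pct(18, TOTAL_SHORTAGES)
share_over_14_days_early = pct(4, TOTAL_SHORTAGES)

# Shares of the 102 shortages that were mentioned on Twitter.
share_first_post_patients = pct(51, ON_TWITTER)
share_first_post_professionals = pct(46, ON_TWITTER)
share_personal_experience = pct(44, ON_TWITTER)
```

Each value matches the figure reported in the abstract, confirming that the early-mention percentages are relative to all shortages while the stakeholder and content percentages are relative to the Twitter-mentioned subset.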
Affiliation(s)
- Doerine J Postma
  - Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences (UIPS), Utrecht University, Utrecht, Netherlands
  - Royal Dutch Pharmacists Association, The Hague, Netherlands
- Magali L A Heijkoop
  - Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences (UIPS), Utrecht University, Utrecht, Netherlands
- Peter A G M De Smet
  - Departments of IQ Healthcare and Clinical Pharmacy, Radboud Institute for Health Sciences, Radboud University Medical Centre, Nijmegen, Netherlands
- Kim Notenboom
  - Dutch Medicines Evaluation Board, Utrecht, Netherlands
- Hubert G M Leufkens
  - Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences (UIPS), Utrecht University, Utrecht, Netherlands
- Aukje K Mantel-Teeuwisse
  - Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences (UIPS), Utrecht University, Utrecht, Netherlands
7
Chen J, He J, Xie Z, Li D. Public perceptions and discussions of synthetic nicotine on Twitter. Front Public Health 2024; 12:1370076. [PMID: 39131569 PMCID: PMC11310114 DOI: 10.3389/fpubh.2024.1370076] [Received: 01/13/2024] [Accepted: 07/15/2024] [Indexed: 08/13/2024]
Abstract
Background As alternatives to tobacco-derived nicotine, synthetic nicotine products have recently emerged and gained increasing popularity. This study analyzes public perception and discussion of synthetic nicotine products on Twitter (now "X"). Methods Through the Twitter streaming API (Application Programming Interface), we collected 2,764 Twitter posts from December 12, 2021, to October 17, 2022, using keywords related to synthetic nicotine. Applying an inductive approach, two research assistants manually determined the relevance of tweets to synthetic nicotine products, assessed the attitude of each tweet toward synthetic nicotine as positive, negative, or neutral, and identified the main topics. Results Among 1,007 tweets related to synthetic nicotine products, the proportion of negative tweets (383/1007, 38.03%) was significantly higher than that of positive tweets (218/1007, 21.65%; p < 0.05). Among negative tweets, major topics included concern about the addiction and health risks of synthetic nicotine products (44.91%) and synthetic nicotine as a policy loophole (31.85%). Among positive tweets, top topics included synthetic nicotine as an alternative replacement for nicotine (39.91%) and reduced health risks (31.19%). Conclusion There are mixed attitudes toward synthetic nicotine products on Twitter, resulting from different perspectives. Future research could incorporate demographic information to understand the attitudes of various population groups.
Affiliation(s)
- Jiarui Chen
  - Goergen Institute for Data Science, University of Rochester, Rochester, NY, United States
- Jinxi He
  - Department of Computer Science, University of Rochester, Rochester, NY, United States
- Zidian Xie
  - Department of Clinical and Translational Research, University of Rochester Medical Center, Rochester, NY, United States
- Dongmei Li
  - Department of Clinical and Translational Research, University of Rochester Medical Center, Rochester, NY, United States
8
Kim M, Lovett JT, Doshi AM, Prabhu V. Immediate Access to Radiology Reports: Perspectives on X Before and After the Cures Act Information Blocking Provision. J Am Coll Radiol 2024; 21:1130-1140. [PMID: 38147904 DOI: 10.1016/j.jacr.2023.12.015] [Received: 09/03/2023] [Revised: 12/12/2023] [Accepted: 12/15/2023] [Indexed: 12/28/2023]
Abstract
OBJECTIVE The 21st Century Cures Act's information blocking provision mandates that patients have immediate access to their electronic health information, including radiology reports. We evaluated public opinions surrounding this policy on X, a microblogging platform with over 400 million users. METHODS We retrieved 27,522 posts related to radiology reports from October 5, 2020, through October 4, 2021. One reviewer performed initial screening for relevant posts. Two reviewers categorized user type and post theme(s) using a predefined coding system. Posts were grouped as "pre-Cures" (6 months before information blocking) and "post-Cures" (6 months after). Descriptive statistics and χ2 tests were performed. RESULTS Among 1,155 final posts, 1,028 unique users were identified (64% patients, 11% non-radiologist physicians, 4% radiologists). X activity increased, with 40% (n = 462) pre-Cures and 60% (n = 693) post-Cures. Early result notification before referring providers was the only theme that significantly increased post-Cures (+3%, P = .001). Common negative themes were frustration (33%), anxiety (27%), and delay (20%). Common positive themes were gratitude for radiologists (52%) and autonomy (21%). Of posts expressing opinions on early access, 84% favored and 16% opposed it, with decreased preference between study periods (P = .006). More patients than physicians preferred early access (92% versus 40%, P < .0001). DISCUSSION X activity increased after the information blocking provision, partly due to conversation about early notification. Despite negative experiences with reports, most users preferred early access. Although the Cures Act is a positive step toward open access, work remains to improve patients' engagement with their radiology results.
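As an illustration of the kind of χ² test mentioned in the METHODS, the sketch below applies a one-degree-of-freedom goodness-of-fit test to the pre-Cures and post-Cures post counts (462 and 693). The abstract does not say that this particular comparison was tested, so the choice of test, the equal-split null hypothesis (both periods last 6 months), and the function names are all assumptions made for this example.

```python
import math


def chi_square_gof(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    For 1 df the statistic is a squared standard normal, so
    P(X > chi2) = erfc(sqrt(chi2 / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2))


# Post counts from the abstract: pre-Cures and post-Cures periods.
observed = [462, 693]
total = sum(observed)

# Assumed null hypothesis: equal activity in the two 6-month periods.
expected = [total / 2, total / 2]

chi2 = chi_square_gof(observed, expected)
p = p_value_df1(chi2)
```

Under these assumptions the statistic is large relative to the 1-df critical value of 3.84, consistent with the abstract's statement that activity increased after the provision took effect.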
Affiliation(s)
- Michelle Kim
  - NYU Langone Health, Department of Radiology, New York, New York
- Ankur M Doshi
  - Associate Professor and Associate Clinical Director, Radiology Informatics, NYU Langone Health, Department of Radiology, New York, New York
- Vinay Prabhu
  - Clinical Assistant Professor, Associate Program Director, and Body MRI Fellowship, NYU Langone Health, Department of Radiology, New York, New York
9
Janssen A, Donnelly C, Shaw T. A Taxonomy for Health Information Systems. J Med Internet Res 2024; 26:e47682. [PMID: 38820575 PMCID: PMC11179026 DOI: 10.2196/47682] [Received: 03/29/2023] [Revised: 10/05/2023] [Accepted: 01/31/2024] [Indexed: 06/02/2024]
Abstract
The health sector is highly digitized, which is enabling the collection of vast quantities of electronic data about health and well-being. These data are collected by a diverse array of information and communication technologies, including systems used by health care organizations, consumer and community sources such as information collected on the web, and passively collected data from technologies such as wearables and devices. Understanding the breadth of technologies that collect these data, and how the data can be actioned, is a challenge for the significant portion of the digital health workforce who interact with health data as part of their duties but are not informatics experts. This viewpoint aims to present a taxonomy categorizing common information and communication technologies that collect electronic data. An initial classification of key information systems collecting electronic health data was undertaken via a rapid review of the literature. Subsequently, a purposeful search of the scholarly and gray literature was undertaken to extract key information about the systems within each category, generate definitions of the systems, and describe their strengths and limitations.
Affiliation(s)
- Anna Janssen
  - Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Candice Donnelly
  - Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Tim Shaw
  - Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
10
Nelson V, Bashyal B, Tan PN, Argyris YA. Vaccine rhetoric on social media and COVID-19 vaccine uptake rates: A triangulation using self-reported vaccine acceptance. Soc Sci Med 2024; 348:116775. [PMID: 38579627 DOI: 10.1016/j.socscimed.2024.116775] [Received: 08/04/2023] [Revised: 12/22/2023] [Accepted: 03/08/2024] [Indexed: 04/07/2024]
Abstract
The primary goal of this study is to examine the association between vaccine rhetoric on Twitter and the public's uptake rates of COVID-19 vaccines in the United States, compared to the extent of an association between self-reported vaccine acceptance and the CDC's uptake rates. We downloaded vaccine-related posts on Twitter in real-time daily for 13 months, from October 2021 to September 2022, collecting over half a billion tweets. A previously validated deep-learning algorithm was then applied to (1) filter out irrelevant tweets and (2) group the remaining relevant tweets into pro-, anti-, and neutral vaccine sentiments. Our results indicate that the tweet counts (combining all three sentiments) were significantly correlated with the uptake rates of all stages of COVID-19 shots (p < 0.01). The self-reported level of vaccine acceptance was not correlated with any of the stages of COVID-19 shots (p > 0.05) but was correlated with the daily new infection counts. These results suggest that although social media posts on vaccines may not represent the public's opinions, they are aligned with the public's behaviors of accepting vaccines, which is an essential step for developing interventions to increase the uptake rates. In contrast, self-reported vaccine acceptance represents the public's opinions, but these were not correlated with the behaviors of accepting vaccines. These outcomes provide empirical support for the validity of social media analytics for gauging the public's vaccination behaviors and understanding a nuanced perspective of the public's vaccine sentiment for health emergencies.
Affiliation(s)
- Victoria Nelson
  - Department of Advertising and Public Relations, College of Communication Arts and Sciences, Michigan State University, 404 Wilson Road, East Lansing, MI, 48864, USA
- Bidhan Bashyal
  - Department of Computer Science and Engineering, College of Engineering, Michigan State University, 428 S Shaw Lane, East Lansing, MI, 48864, USA
- Pang-Ning Tan
  - Department of Computer Science and Engineering, College of Engineering, Michigan State University, 428 S Shaw Lane, East Lansing, MI, 48864, USA
- Young Anna Argyris
  - Department of Media and Information, College of Communication Arts and Sciences, Michigan State University, 404 Wilson Road, East Lansing, MI, 48864, USA
11
Bey R, Cohen A, Trebossen V, Dura B, Geoffroy PA, Jean C, Landman B, Petit-Jean T, Chatellier G, Sallah K, Tannier X, Bourmaud A, Delorme R. Natural language processing of multi-hospital electronic health records for public health surveillance of suicidality. NPJ MENTAL HEALTH RESEARCH 2024; 3:6. [PMID: 38609541 PMCID: PMC10955903 DOI: 10.1038/s44184-023-00046-7] [Received: 02/01/2023] [Accepted: 12/06/2023] [Indexed: 04/14/2024]
Abstract
There is an urgent need to monitor the mental health of large populations, especially during crises such as the COVID-19 pandemic, to timely identify the most at-risk subgroups and to design targeted prevention campaigns. We therefore developed and validated surveillance indicators related to suicidality: the monthly number of hospitalisations caused by suicide attempts and the prevalence among them of five known risks factors. They were automatically computed analysing the electronic health records of fifteen university hospitals of the Paris area, France, using natural language processing algorithms based on artificial intelligence. We evaluated the relevance of these indicators conducting a retrospective cohort study. Considering 2,911,920 records contained in a common data warehouse, we tested for changes after the pandemic outbreak in the slope of the monthly number of suicide attempts by conducting an interrupted time-series analysis. We segmented the assessment time in two sub-periods: before (August 1, 2017, to February 29, 2020) and during (March 1, 2020, to June 31, 2022) the COVID-19 pandemic. We detected 14,023 hospitalisations caused by suicide attempts. Their monthly number accelerated after the COVID-19 outbreak with an estimated trend variation reaching 3.7 (95%CI 2.1-5.3), mainly driven by an increase among girls aged 8-17 (trend variation 1.8, 95%CI 1.2-2.5). After the pandemic outbreak, acts of domestic, physical and sexual violence were more often reported (prevalence ratios: 1.3, 95%CI 1.16-1.48; 1.3, 95%CI 1.10-1.64 and 1.7, 95%CI 1.48-1.98), fewer patients died (p = 0.007) and stays were shorter (p < 0.001). Our study demonstrates that textual clinical data collected in multiple hospitals can be jointly analysed to compute timely indicators describing mental health conditions of populations. 
Our findings also highlight the need to better take into account the violence imposed on women, especially at early ages and in the aftermath of the COVID-19 pandemic.
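The interrupted time-series design described in this abstract compares the trend in monthly counts before and after a breakpoint. The following is a minimal sketch, assuming ordinary least squares fitted separately to each sub-period (which yields the same point estimates as a segmented regression with level-change and slope-change terms). The counts are entirely synthetic, the function names are illustrative, and the abstract does not say whether the study's model also handled seasonality or autocorrelation, so none of that is reproduced here.

```python
def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def trend_variation(months, counts, breakpoint):
    """Slope at or after the breakpoint minus the slope before it."""
    pre = [(m, c) for m, c in zip(months, counts) if m < breakpoint]
    post = [(m, c) for m, c in zip(months, counts) if m >= breakpoint]
    pre_slope = ols_slope([m for m, _ in pre], [c for _, c in pre])
    post_slope = ols_slope([m for m, _ in post], [c for _, c in post])
    return post_slope - pre_slope


# Synthetic monthly counts (NOT study data): 31 months with slope 2,
# then 28 months with slope 5, mirroring the lengths of the two sub-periods.
BREAK = 31
months = list(range(59))
counts = [100 + 2 * m if m < BREAK else 150 + 5 * (m - BREAK) for m in months]

change = trend_variation(months, counts, BREAK)
print(f"estimated trend variation: {change:.1f} per month")
```

On real data, a confidence interval for the slope change would also be needed; this sketch only recovers the point estimate.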
Affiliation(s)
- Romain Bey
- Innovation and Data unit, IT Department, Assistance Publique-Hôpitaux de Paris, Paris, France
| | - Ariel Cohen
- Innovation and Data unit, IT Department, Assistance Publique-Hôpitaux de Paris, Paris, France
| | - Vincent Trebossen
- Child and Adolescent Psychiatry Department, Robert Debré University Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
| | - Basile Dura
- Innovation and Data unit, IT Department, Assistance Publique-Hôpitaux de Paris, Paris, France
| | - Pierre-Alexis Geoffroy
- Département de psychiatrie et d'addictologie, GHU Paris Nord, DMU neurosciences, Bichat - Claude Bernard Hospital, Assistance Publique-Hôpitaux de Paris, 75018, Paris, France
- GHU Paris - psychiatry & neurosciences, 1, rue Cabanis, 75014, Paris, France
- NeuroDiderot, Inserm, FHU I2-D2, université Paris Cité, 75019, Paris, France
- CNRS UPR 3212, Institute for cellular and integrative neurosciences, 67000, Strasbourg, France
| | - Charline Jean
- Innovation and Data unit, IT Department, Assistance Publique-Hôpitaux de Paris, Paris, France
- Université Paris-Est Créteil, INSERM, IMRB U955, Créteil, France
- Service Santé Publique & URC, Hôpital Henri Mondor, Assistance Publique-Hôpitaux de Paris, Créteil, France
| | - Benjamin Landman
- Child and Adolescent Psychiatry Department, Robert Debré University Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
| | - Thomas Petit-Jean
- Innovation and Data unit, IT Department, Assistance Publique-Hôpitaux de Paris, Paris, France
| | - Gilles Chatellier
- Innovation and Data unit, IT Department, Assistance Publique-Hôpitaux de Paris, Paris, France
- Université Paris Cité, Paris, France
| | - Kankoe Sallah
- URC PNVS, CIC-EC 1425, INSERM, Bichat - Claude Bernard Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
| | - Xavier Tannier
- Sorbonne Université, Inserm, Université Sorbonne Paris Nord, Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances pour la e-Santé (LIMICS), Paris, France
| | - Aurelie Bourmaud
- Université Paris Cité, Paris, France
- Clinical Epidemiology Unit, Robert Debré University Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
- CIC 1426, Inserm, Paris, France
| | - Richard Delorme
- Child and Adolescent Psychiatry Department, Robert Debré University Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
- Human Genetics and Cognitive Functions, Institut Pasteur, Paris, France
|
12
|
Gyftopoulos S, Drosatos G, Fico G, Pecchia L, Kaldoudi E. Analysis of Pharmaceutical Companies' Social Media Activity during the COVID-19 Pandemic and Its Impact on the Public. Behav Sci (Basel) 2024; 14:128. [PMID: 38392481 PMCID: PMC10886074 DOI: 10.3390/bs14020128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Revised: 01/30/2024] [Accepted: 02/06/2024] [Indexed: 02/24/2024] Open
Abstract
The COVID-19 pandemic, a period of great turmoil, was coupled with the emergence of an "infodemic", a state when the public was bombarded with vast amounts of unverified information from dubious sources that led to a chaotic information landscape. The excessive flow of messages to citizens, combined with the justified fear and uncertainty imposed by the unknown virus, cast a shadow on the credibility of even well-intentioned sources and affected the emotional state of the public. Several studies highlighted the mental toll this environment took on citizens by analyzing their discourse on online social networks (OSNs). In this study, we focus on the activity of prominent pharmaceutical companies on Twitter, currently known as X, as well as the public's response during the COVID-19 pandemic. Communication between companies and users is examined and compared in two discrete channels, the COVID-19 and the non-COVID-19 channel, based on the content of the posts circulated in them in the period between March 2020 and September 2022, while the emotional profile of the content is outlined through a state-of-the-art emotion analysis model. Our findings indicate significantly increased activity in the COVID-19 channel compared to the non-COVID-19 channel while the predominant emotion in both channels is joy. However, the COVID-19 channel exhibited an upward trend in the circulation of fear by the public. The quotes and replies produced by the users, with a stark presence of negative charge and diffusion indicators, reveal the public's preference for promoting tweets conveying an emotional charge, such as fear, surprise, and joy. The findings of this research study can inform the development of communication strategies based on emotion-aware messages in future crises.
Affiliation(s)
- Sotirios Gyftopoulos
- European Alliance for Medical and Biological Engineering and Science, 3001 Leuven, Belgium
- Institute for Language and Speech Processing, Athena Research Center, 67100 Xanthi, Greece
| | - George Drosatos
- European Alliance for Medical and Biological Engineering and Science, 3001 Leuven, Belgium
- Institute for Language and Speech Processing, Athena Research Center, 67100 Xanthi, Greece
| | - Giuseppe Fico
- European Alliance for Medical and Biological Engineering and Science, 3001 Leuven, Belgium
- Life Supporting Technologies, Universidad Politécnica de Madrid, 28040 Madrid, Spain
| | - Leandro Pecchia
- European Alliance for Medical and Biological Engineering and Science, 3001 Leuven, Belgium
- School of Engineering, University of Warwick, Coventry CV4 7AL, UK
- Department of Engineering, Università Campus Bio-Medico di Roma, 00128 Rome, Italy
| | - Eleni Kaldoudi
- European Alliance for Medical and Biological Engineering and Science, 3001 Leuven, Belgium
- Institute for Language and Speech Processing, Athena Research Center, 67100 Xanthi, Greece
- School of Medicine, Democritus University of Thrace, 68100 Alexandroupoli, Greece
|
13
|
Sanz-Martín D, Ubago-Jiménez JL, Cachón-Zagalaz J, Zurita-Ortega F. Impact of Physical Activity and Bio-Psycho-Social Factors on Social Network Addiction and Gender Differences in Spanish Undergraduate Education Students. Behav Sci (Basel) 2024; 14:110. [PMID: 38392463 PMCID: PMC10886106 DOI: 10.3390/bs14020110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2024] [Revised: 01/26/2024] [Accepted: 01/31/2024] [Indexed: 02/24/2024] Open
Abstract
Social network use has increased in recent years. Social networks are fast-changing and may cause negative effects such as dependence and addiction. Hence, it was decided to establish two research aims: (1) to identify the social network used by university students and their use levels according to their sex and (2) to analyse how age, body mass index, physical activity, emotional intelligence and social network type affect addiction to social networks according to young people's sex. A cross-sectional study was designed involving Spanish university students from Education Degrees. The mean age of the participants was 20.84 years (±2.90). Females made up 69.8% of the sample and males 30.2%. An online questionnaire was administered that included sociodemographic questions, IPAQ-SF and TMMS-24. This study found that all students use WhatsApp and more than 97% have YouTube and Instagram accounts. The linear regression model obtained was as follows: social network addiction = 3.355 + 0.336*emotional attention - 0.263*emotional clarity. There is a positive relationship between social network addiction and emotional attention (r = 0.25; p < 0.001) and negative relationships between social network addiction and emotional clarity (r = -0.16; p = 0.002) and between social network addiction and age (r = -0.17; p = 0.001). University students report lower levels of social network addiction and slightly higher levels of social network addiction among females. In addition, there are significant differences between the average social network addiction scores of university students in terms of their use of Telegram, TikTok and Twitch.
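The regression equation quoted in this abstract can be evaluated directly. The sketch below is only a worked illustration: the coefficients are the ones reported above, but the function name and the input scores are hypothetical, and the abstract does not state the scale on which the emotional attention and emotional clarity predictors were entered.

```python
def predicted_addiction(emotional_attention, emotional_clarity):
    """Evaluate the fitted equation quoted in the abstract."""
    return 3.355 + 0.336 * emotional_attention - 0.263 * emotional_clarity


# Hypothetical predictor scores, chosen only to show the direction of each effect.
low_clarity = predicted_addiction(emotional_attention=4.0, emotional_clarity=2.0)
high_clarity = predicted_addiction(emotional_attention=4.0, emotional_clarity=4.0)

print(round(low_clarity, 3), round(high_clarity, 3))
```

Holding emotional attention fixed, a higher emotional clarity score lowers the predicted value, consistent with the negative coefficient and the negative correlation reported in the abstract.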
Affiliation(s)
- Daniel Sanz-Martín
- Faculty of Humanities and Social Sciences, Universidad Isabel I, 09003 Burgos, Spain
- Musical, Plastic and Corporal Expression Didactics Department, University of Jaén, 23071 Jaén, Spain
| | - José Luis Ubago-Jiménez
- Department of Didactics Musical, Plastic and Corporal Expression, Faculty of Education Science, University of Granada, 18071 Granada, Spain
| | - Javier Cachón-Zagalaz
- Musical, Plastic and Corporal Expression Didactics Department, University of Jaén, 23071 Jaén, Spain
| | - Félix Zurita-Ortega
- Department of Didactics Musical, Plastic and Corporal Expression, Faculty of Education Science, University of Granada, 18071 Granada, Spain
|
14
|
Binmadi N. Oral Cancer and Twitter: An Analysis of Oral Cancer Awareness Month Tweets. Cureus 2024; 16:e54055. [PMID: 38348199 PMCID: PMC10860363 DOI: 10.7759/cureus.54055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/11/2024] [Indexed: 02/15/2024] Open
Abstract
OBJECTIVE The objective of this research was to assess Twitter usage during Oral Cancer Awareness Month and explore the content and engagement related to oral cancer. METHODS A comprehensive search was performed using relevant hashtags and keywords related to oral cancer on Twitter throughout the oral cancer awareness month, April 2022. All extracted tweets that match the inclusion criteria were analyzed for content, users were classified, and their countries were identified. RESULT A total of 5551 English tweets were identified during Oral Cancer Awareness Month, and 5543 were included in the analysis covering a wide range of oral cancer-related topics. The analyzed tweets encompassed a diverse range of topics, from cancer and oral health to oncology, cancer research, cancer awareness, and even discussions related to alcohol. We found that the majority of users who post on Twitter were individuals. The most common tweets were posted from the USA. CONCLUSIONS This study provides an analysis of Twitter activity during Oral Cancer Awareness Month, highlighting the diverse range of content being shared, offering valuable insights. The findings demonstrate the importance of leveraging social media platforms to disseminate information and raise awareness. With a strategic approach to social media, organizations and individuals worldwide have the power to amplify their message, attract attention, and effectively advocate for oral cancer awareness.
Affiliation(s)
- Nada Binmadi
- Department of Oral Diagnostic Sciences, King Abdulaziz University, Jeddah, SAU
|
15
|
Mandava S, Oyer SL, Park SS. A quantitative analysis of Twitter ("X") trends in the discussion of rhinoplasty. Laryngoscope Investig Otolaryngol 2024; 9:e1227. [PMID: 38384363 PMCID: PMC10880128 DOI: 10.1002/lio2.1227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Accepted: 02/04/2024] [Indexed: 02/23/2024] Open
Abstract
Introduction Rhinoplasty is one of the most common cosmetic surgical procedures performed globally. Twitter, also known as "X," is used by both patients and physicians and has been studied as a useful tool for analyzing trends in healthcare. The public social media discourse of rhinoplasty has not been previously reported in the field of otolaryngology. The goal of this study was to characterize the most common user type, sentiment, and temporal trends in the discussion of rhinoplasty on Twitter to guide facial plastic surgeons in their clinical and social media practices. Methods A total of 1,427,015 tweets published from 2015 to 2020 containing the keywords "rhinoplasty" or "nose job" were extracted using Twitter Academic API. Tweets were standardized and filtered for spam and duplication. Natural language processing (NLP) algorithms and data visualization techniques were applied to characterize tweets. Results Significantly more "nose job" tweets (80.8%) were published compared with "rhinoplasty" (19.2%). Annual tweet frequency increased over the 5 years, with "rhinoplasty" tweets peaking in January and "nose job" tweets peaking in the summer and winter months. Most "rhinoplasty" tweets were linked to a surgeon or medical practice source, while most "nose job" tweets were from isolated laypersons. While discussion was positive in sentiment overall (M = +0.08), "nose job" tweets had lower average sentiment scores (P < .001) and over twice the proportion of negative tweets. The top 20 most prolific accounts contributed to 14,758 (10.6%) of total "rhinoplasty" tweets. Exactly 90% (18/20) of those accounts linked to non-academic surgeons compared with 10% (2/20) linked to academic surgeons. Conclusions Rhinoplasty-related posts on Twitter were cumulatively positive in sentiment and tweet volume is steadily increasing over time, especially during popular holiday months. 
The search term "nose job" yields significantly more results than "rhinoplasty," and is the preferred term of non-healthcare users. We found a large digital contribution from surgeons and medical practices, particularly in the non-academic and private practice sector, utilizing Twitter for promotional purposes.
Affiliation(s)
- Shreya Mandava
- School of Medicine, University of Virginia, Charlottesville, Virginia, USA
| | - Samuel L. Oyer
- Department of Otolaryngology-Head and Neck Surgery, University of Virginia, Charlottesville, Virginia, USA
| | - Stephen S. Park
- Department of Otolaryngology-Head and Neck Surgery, University of Virginia, Charlottesville, Virginia, USA
|
16
|
Mayor E, Bietti LM. Language use on Twitter reflects social structure and social disparities. Heliyon 2024; 10:e23528. [PMID: 38293550 PMCID: PMC10825303 DOI: 10.1016/j.heliyon.2023.e23528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Revised: 11/24/2023] [Accepted: 12/05/2023] [Indexed: 02/01/2024] Open
Abstract
Large-scale mental health assessments increasingly rely upon user-contributed social media data. It is widely known that mental health and well-being are affected by minority group membership and social disparity. But do these factors manifest in the language use of social media users? We elucidate this question using spatial lag regressions. We examined the county-level (N = 1069) associations of lexical indicators linked to well-being and mental health, notably depression (e.g., first-person singular pronouns, negative emotions) with markers of social disparity (e.g., the Area Deprivation Index-3) and ethnicity, using a sample of approximately 30 million content-coded tweets (U.S. county-level aggregation). Results confirmed most expected associations: County-level lexical indicators of depression are positively linked with county-level area disparity (e.g., economic hardship and inequity) and percentage of ethnic minority groups. Predictive validity checks show that lexical indicators are related to future health and mental health outcomes. Lexical indicators of depression and adjustment coded from tweets aggregated at the county level could play a crucial role in prioritizing public health campaigns, particularly in socially deprived counties.
|
17
|
Bak M, Chin CL, Chin J. Use of Health Belief Model-based Deep Learning to Understand Public Health Beliefs in Breast Cancer Screening from Social Media before and during the COVID-19 Pandemic. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2024; 2023:280-288. [PMID: 38222395 PMCID: PMC10785880] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2024]
Abstract
Breast cancer is the second leading cause of cancer death for women in the United States. While breast cancer screening participation is the most effective method for early detection, screening rate has remained low. Given that understanding health perception is critical to understand health decisions, our study utilized the Health Belief Model-based deep learning method to predict and examine public health beliefs in breast cancer and its screening behavior. The results showed that the trends in public health perception are sensitive to political (i.e., changes in health policy), sociological (i.e., representation of disease and its preventive care by public figure or organization), psychological (i.e., social support), and environmental factors (i.e., COVID-19 pandemic). Our study explores the roles social media can play in public health surveillance and in public health promotion of preventive care.
Affiliation(s)
- Michelle Bak
- University of Illinois Urbana-Champaign, Champaign, IL, USA
| | - Chieh-Li Chin
- University of Illinois Urbana-Champaign, Champaign, IL, USA
| | - Jessie Chin
- University of Illinois Urbana-Champaign, Champaign, IL, USA
|
18
|
Haggerty T, Sedney CL, Cowher A, Holland D, Davisson L, Dekeseredy P. Twitter and Communicating Stigma about Medications to Treat Obesity. HEALTH COMMUNICATION 2023; 38:3238-3242. [PMID: 36373192 DOI: 10.1080/10410236.2022.2144303] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/16/2023]
Abstract
In North America, stigma remains a significant barrier to treating obesity. Many candidates for medical weight management do not seek treatment, possibly related to anticipated and internalized stigma and weight bias. Pharmacologic treatment of obesity remains highly stigmatized, despite advances in drug development and medical weight management programs. People contemplating medical weight management are likely to see information about "diet pills" on social media sites, such as Twitter. However, Twitter has been found to contain false and stigmatizing information. This study examines a sample of 2170 Tweets to better understand the content through the lens of obesity stigma. Tweets were collected over a seven-day period containing general terms such as "diet pills," "weight loss pills," or "fat burner" using the Twitter advanced search option. The analysis revealed that almost 50% of Tweets containing "diet pills" contained stigmatizing language. The most common elements of stigma communication were taking personal blame for obesity and the perils associated with taking medications for weight loss. Further analysis revealed sub-themes such as profiting from social pressures to lose weight, distrust of physicians and the practice of obesity medicine, lack of efficacy of medications, and the use of social media to disseminate stigma. Most Tweets were from personal accounts followed by direct sales of weight loss supplements. The findings have potential implications for medically supervised weight management programs and may drive the need for more evidence-based social media messaging around obesity related healthcare.
Affiliation(s)
- Treah Haggerty
- Department of Family Medicine, West Virginia University
- Department of Medicine, West Virginia University, WVU Medicine Medical and Surgical Weight Loss Center's Medical Weight Management Program
| | - Cara L Sedney
- Department of Neurosurgery, Rockefeller Neuroscience Institute, West Virginia University
| | | | - Dylan Holland
- Department of Family Medicine, West Virginia University
| | - Laura Davisson
- Department of Medicine, West Virginia University, WVU Medicine Medical and Surgical Weight Loss Center's Medical Weight Management Program
| | - Patricia Dekeseredy
- Department of Neurosurgery, Rockefeller Neuroscience Institute, West Virginia University
|
19
|
Ali SH, Lowery CM, Trude ACB. Leveraging Multiyear, Geospatial Social Media Data for Health Policy Evaluations: Lessons From the Philadelphia Beverage Tax. JOURNAL OF PUBLIC HEALTH MANAGEMENT AND PRACTICE 2023; 29:E253-E262. [PMID: 37467151 DOI: 10.1097/phh.0000000000001804] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/21/2023]
Abstract
CONTEXT Public reactions to health policies are vital to understand policy sustainability and impact but have been elusively difficult to dynamically measure. The 2021 launch of the Twitter Academic Application Programming Interface (API), allowing for historical tweet analyses, represents a potentially powerful tool for complex, comprehensive policy analyses. OBJECTIVE Using the Philadelphia Beverage Tax (implemented January 2017) as a case study, this research extracted longitudinal and geographic changes in sentiments, and key influencers in policy-related conversations. DESIGN The Twitter API was used to retrieve all publicly available tweets related to the Tax between 2016 and 2019. SETTING Twitter. PARTICIPANTS Users who posted publicly available tweets related to the Philadelphia Beverage Tax (PBT). MAIN OUTCOME Tweet content, frequency, sentiment, and user-related information. MEASURES Tweet content, authors, engagement, and location were analyzed in parallel to key PBT events. Published emotional lexicons were used for sentiment analyses. RESULTS A total of 45 891 tweets were retrieved (1311 with geolocation data). Changes in the tweet volume and sentiment were strongly driven by Tax-related litigation. While anger and fear increased in the months prior to the policy's implementation, they progressively decreased after its implementation; trust displayed an inverse trend. The 50 tweeters with the highest positive engagement included media outlets (n = 24), displaying particularly high tweet volume/engagement, and public personalities (n = 10), displaying the greatest polarization in tweet sentiment. Most geo-located tweets, reflecting 321 unique locations, were from the Philadelphia region (55.2%). Sentiment and positive engagement varied, although concentrations of negative sentiments were observed in some Philadelphia suburbs. 
CONCLUSIONS Findings highlighted how longitudinal Twitter data can be leveraged to deconstruct specific, dynamic insights on public policy reactions and information dissemination to inform better policy implementation and evaluation (eg, anticipating catalysts for both heightened public interest and geographic, sentiment changes in policy conversations). This study provides policymakers a blueprint to conduct similar cost and time efficient yet dynamic and multifaceted health policy evaluations.
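The lexicon-based emotion analysis mentioned under MEASURES can be illustrated with a small sketch. Everything here is an assumption made for illustration: the six-word lexicon and the example tweets are invented, the function names are hypothetical, and the abstract does not name the published lexicons or the tokenization the study used.

```python
import re
from collections import Counter

# Toy lexicon invented for this example; published emotion lexicons are far larger.
TOY_LEXICON = {
    "angry": "anger",
    "outrage": "anger",
    "afraid": "fear",
    "worried": "fear",
    "trust": "trust",
    "support": "trust",
}


def emotion_counts(text, lexicon=TOY_LEXICON):
    """Count lexicon emotions among the lowercase word tokens of one text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(lexicon[t] for t in tokens if t in lexicon)


def monthly_profile(tweets_by_month, lexicon=TOY_LEXICON):
    """Aggregate emotion counts per month across all tweets in that month."""
    profile = {}
    for month, tweets in tweets_by_month.items():
        total = Counter()
        for tweet in tweets:
            total.update(emotion_counts(tweet, lexicon))
        profile[month] = dict(total)
    return profile


# Invented example tweets (NOT study data).
example = {
    "2016-12": ["Worried and angry about the new tax", "Outrage at city hall"],
    "2017-06": ["I support the tax", "We trust the programs it funds"],
}
profile = monthly_profile(example)
print(profile)
```

Tracking such monthly counts over time is one simple way to relate emotion trends to dated policy events; normalizing by tweet volume would be a natural next step.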
Affiliation(s)
- Shahmir H Ali
- Department of Social and Behavioral Sciences, New York University School of Global Public Health, New York, New York (Dr Ali); Department of Nutrition, University of North Carolina at Chapel Hill, North Carolina (Ms Lowery); and Department of Nutrition and Food Studies, New York University Steinhardt School of Culture, Education, and Human Development, New York, New York (Dr Trude)
|
20
|
Giorgi S, Eichstaedt JC, Preoţiuc-Pietro D, Gardner JR, Schwartz HA, Ungar LH. Filling in the white space: Spatial interpolation with Gaussian processes and social media data. CURRENT RESEARCH IN ECOLOGICAL AND SOCIAL PSYCHOLOGY 2023; 5:100159. [PMID: 38125747 PMCID: PMC10732585 DOI: 10.1016/j.cresp.2023.100159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
Full national coverage below the state level is difficult to attain through survey-based data collection. Even the largest survey-based data collections, such as the CDC's Behavioral Risk Factor Surveillance System or the Gallup-Healthways Well-being Index (both with more than 300,000 responses p.a.) only allow for the estimation of annual averages for about 260 out of roughly 3,000 U.S. counties when a threshold of 300 responses per county is used. Using a relatively high threshold of 300 responses gives substantially higher convergent validity (higher correlations with health variables) than lower thresholds but covers a reduced and biased sample of the population. We present principled methods to interpolate spatial estimates and show that including large-scale geotagged social media data can increase interpolation accuracy. In this work, we focus on Gallup-reported life satisfaction, a widely-used measure of subjective well-being. We use Gaussian Processes (GP), a formal Bayesian model, to interpolate life satisfaction, which we optimally combine with estimates from low-count data. We interpolate over several spaces (geographic and socioeconomic) and extend these evaluations to the space created by variables encoding language frequencies of approximately 6 million geotagged Twitter users. We find that Twitter language use can serve as a rough aggregate measure of socioeconomic and cultural similarity, and improves upon estimates derived from a wide variety of socioeconomic, demographic, and geographic similarity measures. We show that applying Gaussian Processes to the limited Gallup data allows us to generate estimates for a much larger number of counties while maintaining the same level of convergent validity with external criteria (i.e., N = 1,133 vs. 2,954 counties).
This work suggests that spatial coverage of psychological variables can be reliably extended through Bayesian techniques while maintaining out-of-sample prediction accuracy and that Twitter language adds important information about cultural similarity over and above traditional socio-demographic and geographic similarity measures. Finally, to facilitate the adoption of these methods, we have also open-sourced an online tool that researchers can freely use to interpolate their data across geographies.
Affiliation(s)
- Salvatore Giorgi
- Department of Computer and Information Science, University of Pennsylvania, United States of America
| | - Johannes C. Eichstaedt
- Department of Psychology & Institute for Human-Centered AI, Stanford University, United States of America
| | | | - Jacob R. Gardner
- Department of Computer and Information Science, University of Pennsylvania, United States of America
| | - H. Andrew Schwartz
- Department of Computer Science, Stony Brook University, United States of America
| | - Lyle H. Ungar
- Department of Computer and Information Science, University of Pennsylvania, United States of America
|
21
|
Yan Q, Shan S, Zhang B, Sun W, Sun M, Luo Y, Zhao F, Guo X. Monitoring the Relationship between Social Network Status and Influenza Based on Social Media Data. Disaster Med Public Health Prep 2023; 17:e490. [PMID: 37721020 DOI: 10.1017/dmp.2023.117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/19/2023]
Abstract
BACKGROUND This article aims to analyze the relationship between user characteristics on social networks and influenza. METHODS Three specific research questions are investigated: (1) we classify Weibo updates to recognize influenza-related information based on machine learning algorithms and propose a quantitative model for influenza susceptibility in social networks; (2) we adopt the in-degree indicator from complex network theory as a measure of social media status and compute its correlation coefficient with influenza susceptibility; (3) we also apply the LDA topic model to explore users' physical condition from Weibo and compute its correlation coefficient with influenza susceptibility. From the perspective of social networking status, we analyze and extract influenza-related information from social media, with many advantages including efficiency, low cost, and real-time availability. RESULTS We find a moderate negative correlation between the susceptibility of users to influenza and social network status, while there is a significant positive correlation between physical condition and susceptibility to influenza. CONCLUSIONS Our findings reveal the laws behind the phenomenon of online disease transmission and provide important evidence for analyzing, predicting, and preventing disease transmission. Also, this study provides theoretical and methodological underpinnings for further exploration and measurement of more factors associated with infection control and public health from social networks.
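Step (2) of the methods, relating in-degree to influenza susceptibility, can be sketched as follows. This is a minimal illustration under stated assumptions: an edge `(a, b)` is taken to mean that user `a` follows user `b`, the edge list and susceptibility scores are invented, and Pearson's coefficient is used because the abstract does not say which correlation coefficient the study computed.

```python
from collections import Counter
from math import sqrt


def in_degrees(edges):
    """Count incoming edges per user; an edge (a, b) means a follows b."""
    return Counter(target for _, target in edges)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical follower edges and susceptibility scores (NOT study data).
edges = [("b", "a"), ("c", "a"), ("d", "a"), ("a", "b"), ("c", "b"), ("a", "c")]
susceptibility = {"a": 0.1, "b": 0.3, "c": 0.4, "d": 0.8}

degrees = in_degrees(edges)
users = sorted(susceptibility)
r = pearson([degrees.get(u, 0) for u in users], [susceptibility[u] for u in users])
print(f"in-degree vs. susceptibility: r = {r:.3f}")
```

In this toy network the users with more followers were given lower susceptibility scores, so the coefficient comes out negative, matching the direction of the association reported in RESULTS.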
Affiliation(s)
- Qi Yan
- Management School, Tianjin Normal University, Tianjin, China
| | - Siqing Shan
- School of Economics and Management, Beihang University, Beijing, China
- Beijing Key Laboratory of Emergency Support Simulation Technologies for City Operation, Beijing, China
| | - Baishang Zhang
- Development Research Center of State Administration for Market Regulation of the PR China, Beijing, China
| | - Weize Sun
- School of Economics and Management, Beihang University, Beijing, China
- Beijing Key Laboratory of Emergency Support Simulation Technologies for City Operation, Beijing, China
| | - Menghan Sun
- School of Economics and Management, Beihang University, Beijing, China
- Beijing Key Laboratory of Emergency Support Simulation Technologies for City Operation, Beijing, China
| | - Yiting Luo
- School of Economics and Management, Beihang University, Beijing, China
- Beijing Key Laboratory of Emergency Support Simulation Technologies for City Operation, Beijing, China
| | - Feng Zhao
- School of Economics and Management, Beihang University, Beijing, China
- Beijing Key Laboratory of Emergency Support Simulation Technologies for City Operation, Beijing, China
| | - Xiaoshuang Guo
- School of Economics and Management, Beihang University, Beijing, China
- Beijing Key Laboratory of Emergency Support Simulation Technologies for City Operation, Beijing, China
|
22
|
Lösch L, Zuiderent-Jerak T, Kunneman F, Syurina E, Bongers M, Stein ML, Chan M, Willems W, Timen A. Capturing Emerging Experiential Knowledge for Vaccination Guidelines Through Natural Language Processing: Proof-of-Concept Study. J Med Internet Res 2023; 25:e44461. [PMID: 37610972 PMCID: PMC10503655 DOI: 10.2196/44461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2022] [Revised: 07/11/2023] [Accepted: 07/27/2023] [Indexed: 08/25/2023] Open
Abstract
BACKGROUND Experience-based knowledge and value considerations of health professionals, citizens, and patients are essential to formulate public health and clinical guidelines that are relevant and applicable to medical practice. Conventional methods for incorporating such knowledge into guideline development often involve a limited number of representatives and are considered to be time-consuming. Including experiential knowledge can be crucial during rapid guidance production in response to a pandemic but it is difficult to accomplish. OBJECTIVE This proof-of-concept study explored the potential of artificial intelligence (AI)-based methods to capture experiential knowledge and value considerations from existing data channels to make these insights available for public health guideline development. METHODS We developed and examined AI-based methods in relation to the COVID-19 vaccination guideline development in the Netherlands. We analyzed Dutch messages shared between December 2020 and June 2021 on social media and on 2 databases from the Dutch National Institute for Public Health and the Environment (RIVM), where experiences and questions regarding COVID-19 vaccination are reported. First, natural language processing (NLP) filtering techniques and an initial supervised machine learning model were developed to identify this type of knowledge in a large data set. Subsequently, structural topic modeling was performed to discern thematic patterns related to experiences with COVID-19 vaccination. RESULTS NLP methods proved to be able to identify and analyze experience-based knowledge and value considerations in large data sets. They provide insights into a variety of experiential knowledge that is difficult to obtain otherwise for rapid guideline development. Some topics addressed by citizens, patients, and professionals can serve as direct feedback to recommendations in the guideline. 
For example, a topic pointed out that although travel was not considered as a reason warranting prioritization for vaccination in the national vaccination campaign, there was a considerable need for vaccines for indispensable travel, such as cross-border informal caregiving, work or study, or accessing specialized care abroad. Another example is the ambiguity regarding the definition of medical risk groups prioritized for vaccination, with many citizens not meeting the formal priority criteria while being equally at risk. Such experiential knowledge may help the early identification of problems with the guideline's application and point to frequently occurring exceptions that might initiate a revision of the guideline text. CONCLUSIONS This proof-of-concept study presents NLP methods as viable tools to access and use experience-based knowledge and value considerations, possibly contributing to robust, equitable, and applicable guidelines. They offer a way for guideline developers to gain insights into health professionals, citizens, and patients' experience-based knowledge, especially when conventional methods are difficult to implement. AI-based methods can thus broaden the evidence and knowledge base available for rapid guideline development and may therefore be considered as an important addition to the toolbox of pandemic preparedness.
Affiliation(s)
- Lea Lösch
  - Athena Institute, Faculty of Science, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
- Teun Zuiderent-Jerak
  - Athena Institute, Faculty of Science, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
- Florian Kunneman
  - Department of Computer Science, Faculty of Science, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
- Elena Syurina
  - Athena Institute, Faculty of Science, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
- Marloes Bongers
  - Centre for Infectious Disease Control (CIb), National Institute for Public Health and the Environment (RIVM), Bilthoven, Netherlands
- Mart L Stein
  - Centre for Infectious Disease Control (CIb), National Institute for Public Health and the Environment (RIVM), Bilthoven, Netherlands
- Michelle Chan
  - Department of Computer Science, Faculty of Science, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
- Willemine Willems
  - Athena Institute, Faculty of Science, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
- Aura Timen
  - Athena Institute, Faculty of Science, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
  - Centre for Infectious Disease Control (CIb), National Institute for Public Health and the Environment (RIVM), Bilthoven, Netherlands
  - Department of Primary and Community Care, Radboud University Medical Centre, Nijmegen, Netherlands
|
23
|
Sinha GR, Larrison CR, Brooks I, Kursuncu U. Comparing Naturalistic Mental Health Expressions on Student Loan Debts Using Reddit and Twitter. Journal of Evidence-Based Social Work (2019) 2023; 20:727-742. [PMID: 37461303 DOI: 10.1080/26408066.2023.2202668] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 07/20/2023]
Abstract
PURPOSE The primary objective of this study was to identify patterns in users' naturalistic expressions on student loans on two social media platforms. The secondary objective was to examine how these patterns, sentiments, and emotions associated with student loans differ in user posts indicating mental illness. MATERIAL AND METHOD Data for this study were collected from Reddit and Twitter (2009-2020, n = 85,664) using key terms related to student loans along with first-person pronouns as a triangulating measure of posts by individuals. Unsupervised and supervised machine learning models were used to analyze the text data. RESULTS Results suggested 50 topics in Reddit finance communities and 40 each in Reddit mental health communities and Twitter. Statistically significant associations were found between mental illness statuses and sentiments and emotions. Posts expressing mental illness showed more negative sentiments and were more likely to express sadness and fear. DISCUSSION AND CONCLUSION Patterns in social media discussions indicate both academic and non-academic consequences of having student debt, including users' desire to know more about their debts. Interventions should address the skill and information gaps between what is desired by the borrowers and what is offered to them in understanding and managing their debts. The cognitive burden created by student debts manifests itself on social media and can be used as an important marker to develop a nuanced understanding of people's expressions on a variety of socioeconomic issues. Higher volumes of negative sentiments and emotions of sadness, fear, and anger warrant the immediate attention of policymakers and practitioners to reduce the cognitive burden of student debts.
Affiliation(s)
- Gaurav R Sinha
  - School of Social Work, University of Georgia, Athens, Georgia, USA
- Christopher R Larrison
  - School of Social Work, University of Illinois at Urbana-Champaign, Urbana-Champaign, USA
- Ian Brooks
  - Center for Health Informatics, The PAHO/WHO Collaborating Center on Information Systems for Health, and School of Information Sciences, University of Illinois at Urbana-Champaign, Urbana-Champaign, USA
- Ugur Kursuncu
  - J. Mack Robinson College of Business, Georgia State University, Atlanta, USA
|
24
|
Ellis JT, Reichel MP. Twitter trends in #Parasitology determined by text mining and topic modelling. Current Research in Parasitology & Vector-Borne Diseases 2023; 4:100138. [PMID: 37670843 PMCID: PMC10475476 DOI: 10.1016/j.crpvbd.2023.100138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 08/08/2023] [Accepted: 08/10/2023] [Indexed: 09/07/2023]
Abstract
This study investigated the emergence and use of Twitter, as of July 2023 being rebranded as X, as the main forum for social media communication in parasitology. A dataset of tweets was constructed using a keyword search of Twitter with the search terms 'malaria', 'Plasmodium', 'Leishmania', 'Trypanosoma', 'Toxoplasma' and 'Schistosoma' for the period from 2011 to 2020. Exploratory data analyses of tweet content were conducted, including language, usernames and hashtags. To identify parasitology topics of discussion, keywords and phrases were extracted using KeyBert and biterm topic modelling. The sentiment of tweets was analysed using VADER. The results show that the number of tweets including the keywords increased from 2011 (for malaria) and 2013 (for the others) to 2020, with the highest number of tweets being recorded in 2020. The maximum number of yearly tweets for Plasmodium, Leishmania, Toxoplasma, Trypanosoma and Schistosoma was recorded in 2020 (2804, 2161, 1570, 680 and 360 tweets, respectively). English was the most commonly used language for tweeting, although the percentage varied across the searches. In tweets mentioning Leishmania, only ∼37% were in English, with Spanish being more common. Across all the searches, Portuguese was another common language found. Popular tweets on Toxoplasma contained keywords relating to mental health including depression, anxiety and schizophrenia. The Trypanosoma tweets referenced drugs (benznidazole, nifurtimox) and vectors (bugs, triatomines, tsetse), while the Schistosoma tweets referenced areas of biology including pathology, eggs and snails. A wide variety of individuals and organisations were shown to be associated with Twitter activity. Many journals in the parasitology arena regularly tweet about publications from their journal, and professional societies promote activity and events that are important to them. These represent examples of trusted sources of information, often by experts in their fields. 
Influencers with large numbers of followers, however, might have little or no training in science. The existence of such tweeters is a cause for concern in parasitology, as it raises questions about the quality of the information being disseminated.
Affiliation(s)
- John T. Ellis
  - School of Life Sciences, University of Technology Sydney, Broadway, NSW, Australia
- Michael P. Reichel
  - Department of Population Medicine & Diagnostic Sciences, College of Veterinary Medicine, Cornell University, Ithaca, NY, USA
|
25
|
Kober SE, Buchrieser F, Wood G. Neurofeedback on twitter: Evaluation of the scientific credibility and communication about the technique. Heliyon 2023; 9:e18931. [PMID: 37600360 PMCID: PMC10432958 DOI: 10.1016/j.heliyon.2023.e18931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2023] [Revised: 07/31/2023] [Accepted: 08/02/2023] [Indexed: 08/22/2023]
Abstract
Neurofeedback is a popular technique to induce neuroplasticity with a controversial reputation. The public discourse on neurofeedback, as a therapeutic and neuroenhancement technique, encompasses scientific communication, therapeutic expectations and outcomes, as well as complementary and alternative practices. We investigated Twitter publications from 2010 to 2022 on the keyword "neurofeedback". More than 138,000 tweets were obtained, originating from more than 42,000 different users. The communication flow in the neurofeedback community is mainly unidirectional and non-interactive. Analysis of hashtags revealed application fields, therapy providers, and neuroenhancement to be the most popular content in neurofeedback communication. A group of 1221 productive users was identified, to which clinicians, entrepreneurs, broadcasters, and scientists contribute. We identified reactions to critical publications in the Twitter traffic and an increase in the number of tweets by academic users, which suggests growing interest in the scientific credibility of neurofeedback. More intense scientific communication on neurofeedback on Twitter may help promote a more realistic view of the challenges and advances regarding good scientific practice in neurofeedback.
|
26
|
Giorgi S, Yaden DB, Eichstaedt JC, Ungar LH, Schwartz HA, Kwarteng A, Curtis B. Predicting U.S. county opioid poisoning mortality from multi-modal social media and psychological self-report data. Sci Rep 2023; 13:9027. [PMID: 37270657 PMCID: PMC10238775 DOI: 10.1038/s41598-023-34468-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/30/2022] [Accepted: 04/30/2023] [Indexed: 06/05/2023]
Abstract
Opioid poisoning mortality is a substantial public health crisis in the United States, with opioids involved in approximately 75% of the nearly 1 million drug related deaths since 1999. Research suggests that the epidemic is driven by both over-prescribing and social and psychological determinants such as economic stability, hopelessness, and isolation. Hindering this research is a lack of measurements of these social and psychological constructs at fine-grained spatial and temporal resolutions. To address this issue, we use a multi-modal data set consisting of natural language from Twitter, psychometric self-reports of depression and well-being, and traditional area-based measures of socio-demographics and health-related risk factors. Unlike previous work using social media data, we do not rely on opioid or substance related keywords to track community poisonings. Instead, we leverage a large, open vocabulary of thousands of words in order to fully characterize communities suffering from opioid poisoning, using a sample of 1.5 billion tweets from 6 million U.S. county mapped Twitter users. Results show that Twitter language predicted opioid poisoning mortality better than factors relating to socio-demographics, access to healthcare, physical pain, and psychological well-being. Additionally, risk factors revealed by the Twitter language analysis included negative emotions, discussions of long work hours, and boredom, whereas protective factors included resilience, travel/leisure, and positive emotions, dovetailing with results from the psychometric self-report data. The results show that natural language from public social media can be used as a surveillance tool for both predicting community opioid poisonings and understanding the dynamic social and psychological nature of the epidemic.
Affiliation(s)
- Salvatore Giorgi
  - National Institute on Drug Abuse, Intramural Research Program, Baltimore, MD, USA
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- David B Yaden
  - Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Johannes C Eichstaedt
  - Department of Psychology, Stanford University, Stanford, CA, USA
  - Institute for Human-Centered AI, Stanford University, Stanford, CA, USA
- Lyle H Ungar
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- H Andrew Schwartz
  - Department of Computer Science, Stony Brook University, Stony Brook, NY, USA
- Amy Kwarteng
  - National Institute on Drug Abuse, Intramural Research Program, Baltimore, MD, USA
- Brenda Curtis
  - National Institute on Drug Abuse, Intramural Research Program, Baltimore, MD, USA
|
27
|
Pérez-Pérez M, Ferreira T, Igrejas G, Fdez-Riverola F. A novel gluten knowledge base of potential biomedical and health-related interactions extracted from the literature: using machine learning and graph analysis methodologies to reconstruct the bibliome. J Biomed Inform 2023:104398. [PMID: 37230405 DOI: 10.1016/j.jbi.2023.104398] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/17/2022] [Revised: 05/12/2023] [Accepted: 05/15/2023] [Indexed: 05/27/2023]
Abstract
BACKGROUND In return for their nutritional properties and broad availability, cereal crops have been associated with different alimentary disorders and symptoms, with the majority of the responsibility being attributed to gluten. Gluten-related research literature therefore continues to be produced at ever-growing rates, driven in part by recent exploratory studies that link gluten to non-traditional diseases and by the popularity of gluten-free diets, making it increasingly difficult to access and analyse practical and structured information. In this sense, the accelerated discovery of novel advances in diagnosis and treatment, as well as exploratory studies, produces a favourable scenario for disinformation and misinformation. OBJECTIVES Aligned with the European Union strategy "Delivering on EU Food Safety and Nutrition in 2050", which emphasizes the inextricable links between imbalanced diets, the increased exposure to unreliable sources of information and misleading information, and the increased dependency on reliable sources of information, this paper presents GlutKNOIS, a public and interactive literature-based database that reconstructs and represents the experimental biomedical knowledge extracted from the gluten-related literature. The developed platform includes different external database knowledge, bibliometrics statistics and social media discussion to propose a novel and enhanced way to search, visualise and analyse potential biomedical and health-related interactions in relation to the gluten domain.
METHODS For this purpose, the presented study applies a semi-supervised curation workflow that combines natural language processing techniques, machine learning algorithms, ontology-based normalization and integration approaches, named entity recognition methods, and graph knowledge reconstruction methodologies to process, classify, represent and analyse the experimental findings contained in the literature, which is also complemented by data from the social discussion. RESULTS AND CONCLUSIONS A total of 5,814 documents were manually annotated and 7,424 were fully automatically processed to reconstruct the first online gluten-related knowledge database of evidenced health-related interactions that produce health or metabolic changes based on the literature. In addition, the automatic processing of the literature combined with the knowledge representation methodologies proposed has the potential to assist in the revision and analysis of years of gluten research. The reconstructed knowledge base is public and accessible at https://sing-group.org/glutknois/.
Affiliation(s)
- Martín Pérez-Pérez
  - CINBIO, Universidade de Vigo, Department of Computer Science, ESEI - Escuela Superior de Ingeniería Informática, 32004 Ourense, España
  - SING Research Group, Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, Spain
- Tânia Ferreira
  - Department of Genetics and Biotechnology, University of Trás-os-Montes and Alto Douro, Vila Real, Portugal
  - Functional Genomics and Proteomics Unit, University of Trás-os-Montes and Alto Douro, Vila Real, Portugal
- Gilberto Igrejas
  - Department of Genetics and Biotechnology, University of Trás-os-Montes and Alto Douro, Vila Real, Portugal
  - Functional Genomics and Proteomics Unit, University of Trás-os-Montes and Alto Douro, Vila Real, Portugal
  - LAQV-REQUIMTE, Faculty of Science and Technology, Nova University of Lisbon, Lisbon, Portugal
- Florentino Fdez-Riverola
  - CINBIO, Universidade de Vigo, Department of Computer Science, ESEI - Escuela Superior de Ingeniería Informática, 32004 Ourense, España
  - SING Research Group, Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, Spain
|
28
|
Di Cara NH, Maggio V, Davis OSP, Haworth CMA. Methodologies for Monitoring Mental Health on Twitter: Systematic Review. J Med Internet Res 2023; 25:e42734. [PMID: 37155236 PMCID: PMC10203928 DOI: 10.2196/42734] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/20/2022] [Revised: 11/23/2022] [Accepted: 03/15/2023] [Indexed: 05/10/2023]
Abstract
BACKGROUND The use of social media data to predict mental health outcomes has the potential to allow for the continuous monitoring of mental health and well-being and provide timely information that can supplement traditional clinical assessments. However, it is crucial that the methodologies used to create models for this purpose are of high quality from both a mental health and machine learning perspective. Twitter has been a popular choice of social media because of the accessibility of its data, but access to big data sets is not a guarantee of robust results. OBJECTIVE This study aims to review the current methodologies used in the literature for predicting mental health outcomes from Twitter data, with a focus on the quality of the underlying mental health data and the machine learning methods used. METHODS A systematic search was performed across 6 databases, using keywords related to mental health disorders, algorithms, and social media. In total, 2759 records were screened, of which 164 (5.94%) papers were analyzed. Information about methodologies for data acquisition, preprocessing, model creation, and validation was collected, as well as information about replicability and ethical considerations. RESULTS The 164 studies reviewed used 119 primary data sets. There were an additional 8 data sets identified that were not described in enough detail to include, and 6.1% (10/164) of the papers did not describe their data sets at all. Of these 119 data sets, only 16 (13.4%) had access to ground truth data (ie, known characteristics) about the mental health disorders of social media users. The other 86.6% (103/119) of data sets collected data by searching keywords or phrases, which may not be representative of patterns of Twitter use for those with mental health disorders. The annotation of mental health disorders for classification labels was variable, and 57.1% (68/119) of the data sets had no ground truth or clinical input on this annotation. 
Despite being a common mental health disorder, anxiety received little attention. CONCLUSIONS The sharing of high-quality ground truth data sets is crucial for the development of trustworthy algorithms that have clinical and research utility. Further collaboration across disciplines and contexts is encouraged to better understand what types of predictions will be useful in supporting the management and identification of mental health disorders. A series of recommendations for researchers in this field and for the wider research community are made, with the aim of enhancing the quality and utility of future outputs.
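The proportions quoted in this abstract can be recomputed from the counts it reports. The following is a minimal sketch using only the figures stated above; the helper name `pct` and the dictionary keys are illustrative labels, not anything taken from the paper.

```python
def pct(part, whole, digits=1):
    """Return part/whole as a percentage rounded to the given number of digits."""
    return round(100 * part / whole, digits)

# Counts as reported in the abstract; key names are hypothetical labels.
reported = {
    "papers_analyzed_of_screened": pct(164, 2759, digits=2),
    "papers_without_dataset_description": pct(10, 164),
    "datasets_with_ground_truth": pct(16, 119),
    "datasets_from_keyword_search": pct(103, 119),
    "datasets_without_clinical_input": pct(68, 119),
}

for name, value in reported.items():
    print(f"{name}: {value}%")
```

The recomputed values agree with the percentages given in the abstract, which is a quick consistency check when extracting figures from a review.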
Affiliation(s)
- Nina H Di Cara
  - School of Psychological Science, University of Bristol, Bristol, United Kingdom
  - MRC Integrative Epidemiology Unit, University of Bristol, Bristol, United Kingdom
- Valerio Maggio
  - MRC Integrative Epidemiology Unit, University of Bristol, Bristol, United Kingdom
  - Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- Oliver S P Davis
  - MRC Integrative Epidemiology Unit, University of Bristol, Bristol, United Kingdom
  - Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, United Kingdom
  - The Alan Turing Institute, London, United Kingdom
- Claire M A Haworth
  - School of Psychological Science, University of Bristol, Bristol, United Kingdom
  - The Alan Turing Institute, London, United Kingdom
|
29
|
Aruah DE, Henshaw Y, Walsh-Childers K. Tweets That Matter: Exploring the Solutions to Maternal Mortality in the United States Discussed by Advocacy Organizations on Twitter. International Journal of Environmental Research and Public Health 2023; 20:5617. [PMID: 37174137 PMCID: PMC10178367 DOI: 10.3390/ijerph20095617] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Revised: 04/14/2023] [Accepted: 04/21/2023] [Indexed: 05/15/2023]
Abstract
This study investigated maternal mortality solutions mentioned on Twitter by maternal health advocacy organizations in the United States. Using qualitative content analysis, we examined tweets from 20 advocacy organizations and found that the majority of the tweets focused on policy, healthcare, community, and individual solutions. The most tweeted policy solutions include tweets advocating signing birth equity, paid family leave, Medicaid expansion, and reproductive justice bills, whereas the most tweeted community solutions were funding community organizations, hiring community doulas, and building community health centers. The most tweeted individual solutions were storytelling, self-advocacy, and self-care. These findings provide insights into the perspectives and priorities of advocacy organizations working to address maternal mortality in the United States and can inform future efforts to combat this critical public health issue.
Affiliation(s)
- Diane Ezeh Aruah
  - Communication Department, Tennessee State University, Nashville, TN 37209, USA
- Kim Walsh-Childers
  - College of Journalism and Communication, University of Florida, Gainesville, FL 32611, USA
|
30
|
Coombs T, Abdelkader A, Ginige T, Van Calster P, Assi S. Understanding synthetic drug analogues among the homeless population from the perspectives of the public: thematic analysis of Twitter data. Journal of Substance Use 2023. [DOI: 10.1080/14659891.2023.2173092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/11/2023]
Affiliation(s)
- Thomas Coombs
  - Faculty of Science and Technology, Bournemouth University, Poole, UK
- Amor Abdelkader
  - Faculty of Science and Technology, Bournemouth University, Poole, UK
- Tilak Ginige
  - Faculty of Science and Technology, Bournemouth University, Poole, UK
- Sulaf Assi
  - School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, UK
|
31
|
Ramanna S, Ashrafi N, Loster E, Debroni K, Turner S. Rough-set based learning: Assessing patterns and predictability of anxiety, depression, and sleep scores associated with the use of cannabinoid-based medicine during COVID-19. Front Artif Intell 2023; 6:981953. [PMID: 36872936 PMCID: PMC9975391 DOI: 10.3389/frai.2023.981953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2022] [Accepted: 01/27/2023] [Indexed: 02/17/2023]
Abstract
Recent research highlights the potential beneficial effects of cannabinoids on anxiety, mood, and sleep disorders and points to an increased use of cannabinoid-based medicines since COVID-19 was declared a pandemic. The objective of this research is threefold: i) to evaluate the relationship between the clinical delivery of cannabinoid-based medicine and anxiety, depression, and sleep scores by utilizing machine learning, specifically rough set methods; ii) to discover patterns based on patient features such as specific cannabinoid recommendations, diagnosis information, and decreasing/increasing levels of clinical assessment tool (CAT) scores over a period of time; and iii) to predict whether new patients could potentially experience either an increase or decrease in CAT scores. The dataset for this study was derived from patient visits to Ekosi Health Centres, Canada, over a 2-year period including the COVID timeline. Extensive pre-processing and feature engineering were performed. A class feature indicative of patients' progress or lack thereof due to the treatment received was introduced. Six Rough/Fuzzy-Rough classifiers as well as Random Forest and RIPPER classifiers were trained on the patient dataset using a 10-fold stratified CV method. The highest overall accuracy, sensitivity, and specificity measures of over 99% were obtained using the rule-based rough-set learning model. In this study, we have identified a rough-set-based machine learning model with high accuracy that could be utilized for future studies regarding cannabinoids and precision medicine.
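The 10-fold stratified cross-validation mentioned in this abstract can be illustrated with a short, dependency-free sketch. It is not the authors' pipeline: the function name `stratified_k_fold`, the class labels, and the 70/30 class balance are all hypothetical, chosen only to show how stratification keeps class proportions roughly constant across folds.

```python
import random
from collections import defaultdict

def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs, one per fold.

    Indices of each class are shuffled and dealt round-robin into k folds,
    so every fold keeps approximately the overall class proportions.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test

# Hypothetical class feature: 70 patients labelled "improved", 30 "not_improved".
labels = ["improved"] * 70 + ["not_improved"] * 30
splits = list(stratified_k_fold(labels, k=10))

for fold_number, (train, test) in enumerate(splits, start=1):
    improved = sum(labels[i] == "improved" for i in test)
    print(f"fold {fold_number}: test size {len(test)}, improved {improved}")
```

Each classifier would then be trained on `train` and scored on `test` for every fold, with accuracy, sensitivity, and specificity averaged over the ten folds.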
Affiliation(s)
- Sheela Ramanna
  - Department of Applied Computer Science, University of Winnipeg, Winnipeg, MB, Canada
- Negin Ashrafi
  - Department of Applied Computer Science, University of Winnipeg, Winnipeg, MB, Canada
- Evan Loster
  - Ekosi Health Centre Corporation, Winnipeg, MB, Canada
- Karen Debroni
  - Ekosi Health Centre Corporation, Winnipeg, MB, Canada
|
32
|
Abstract
OBJECTIVE To analyze tweets associated with Ménière's disease (MD), including type of users who engage, change in usage patterns, and temporal associations, and to compare the perceptions of the general public with healthcare providers. METHODS An R-program code, academictwitterR API, was used to query Twitter. All tweets mentioning MD from 2007 to 2021 were retrieved and analyzed. Valence Aware Dictionary and Sentiment Reasoning was used as a model to assess sentiment of tweets. Two reviewers assessed 1,007 tweets for qualitative analysis, identifying the source and the topic of the tweet. RESULTS A total of 37,402 tweets were analyzed. The number of tweets per user ranged from 1 to 563 (M = 33.7, SD = 91.1). Quantitative analysis showed no temporal or seasonal association; however, tweeting increased when celebrities were diagnosed with MD. Of the 1,007 representative tweets analyzed, 60.6% of tweets came from the general public and were largely of negative sentiment focusing on quality of life and support, whereas healthcare providers accounted for 23% of all tweets and focused on treatment/prevention. Tweets by news sources accounted for the remaining 13% of all tweets and were primarily positive in sentiment and focused on awareness. CONCLUSIONS MD is commonly tweeted about by the general public, with limited input regarding the disease from healthcare providers. Healthcare providers must provide accurate information and awareness regarding MD, especially when awareness is highest, such as when celebrities are diagnosed. LEVEL OF EVIDENCE Level IV. IRB or IACUC: Not applicable.
|
33
|
León-Quismondo J. Social Sensing and Individual Brands in Sports: Lessons Learned from English-Language Reactions on Twitter to Pau Gasol's Retirement Announcement. International Journal of Environmental Research and Public Health 2023; 20:895. [PMID: 36673653 PMCID: PMC9859528 DOI: 10.3390/ijerph20020895] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2022] [Revised: 12/23/2022] [Accepted: 12/31/2022] [Indexed: 06/17/2023]
Abstract
Pau Gasol announced his retirement on 5 October 2021. Subsequently, a number of users virtually reacted. Twitter is one of the most popular social media platforms, with more than 368 million active users, generating large-scale social data. This study used data from Twitter for analyzing social sensing related to an individual brand, Pau Gasol's retirement announcement, from a quantitative and qualitative content analysis perspective. Pau Gasol's farewell can be considered a unique event to which many people are emotionally attached, providing a great opportunity for understanding sports virtual ecosystems. A total of 2089 tweets in the English language were recovered from Tuesday 5 October 2021 at 3:00 to Thursday 7 October 2021 at 23:59, Greenwich Mean Time +00:00 time zone. During this time, posts were observed to be mainly influential during and right after Pau Gasol's ceremony. The tweets that created more impact were published by news sources or by sports reporters. Lastly, the themes that emerged showed that the Los Angeles Lakers and the NBA were the two most important milestones in Pau Gasol's career. The data can be used to detect potential areas of controversy or other issues to be addressed in order to preserve the athlete's public image. These results are considered of interest for reaching better knowledge of sport virtual environments through social sensing, supporting the idea of users acting as sensors.
|
34
|
Jing F, Li Z, Qiao S, Zhang J, Olatosi B, Li X. Using geospatial social media data for infectious disease studies: a systematic review. International Journal of Digital Earth 2023; 16:130-157. [PMID: 37997607 PMCID: PMC10664840 DOI: 10.1080/17538947.2022.2161652] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/04/2022] [Accepted: 12/17/2022] [Indexed: 11/25/2023]
Abstract
Geospatial social media (GSM) data has been increasingly used in public health due to its rich, timely, and accessible spatial information, particularly in infectious disease research. This review synthesized 86 research articles that use GSM data in infectious diseases published between December 2013 and March 2022. These articles cover 12 infectious disease types ranging from respiratory infectious diseases to sexually transmitted diseases with spatial levels varying from the neighborhood, county, state, and country. We categorized these studies into three major infectious disease research domains: surveillance, explanation, and prediction. With the assistance of advanced statistical and spatial methods, GSM data has been widely and deeply applied to these domains, particularly in surveillance and explanation domains. We further identified four knowledge gaps in terms of contextual information use, application scopes, spatiotemporal dimension, and data limitations and proposed innovation opportunities for future research. Our findings will contribute to a better understanding of using GSM data in infectious diseases studies and provide insights into strategies for using GSM data more effectively in future research.
Affiliation(s)
- Fengrui Jing
  - Geoinformation and Big Data Research Laboratory, Department of Geography, University of South Carolina, Columbia, SC, USA
  - Big Data Health Science Center, University of South Carolina, Columbia, SC, USA
- Zhenlong Li
  - Geoinformation and Big Data Research Laboratory, Department of Geography, University of South Carolina, Columbia, SC, USA
  - Big Data Health Science Center, University of South Carolina, Columbia, SC, USA
- Shan Qiao
  - Big Data Health Science Center, University of South Carolina, Columbia, SC, USA
  - Department of Health Promotion, Education, and Behavior, Arnold School of Public Health, University of South Carolina, Columbia, SC, USA
- Jiajia Zhang
  - Big Data Health Science Center, University of South Carolina, Columbia, SC, USA
  - Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, Columbia, SC, USA
- Banky Olatosi
  - Big Data Health Science Center, University of South Carolina, Columbia, SC, USA
  - Department of Health Services Policy and Management, Arnold School of Public Health, University of South Carolina, Columbia, SC, USA
- Xiaoming Li
  - Big Data Health Science Center, University of South Carolina, Columbia, SC, USA
  - Department of Health Promotion, Education, and Behavior, Arnold School of Public Health, University of South Carolina, Columbia, SC, USA
|
35
|
Valdez D, Jozkowski KN, Montenegro MS, Crawford BL, Jackson F. Identifying accurate pro-choice and pro-life identity labels in Spanish: Social media insights and implications for comparative survey research. Perspectives on Sexual and Reproductive Health 2022; 54:166-176. [PMID: 36254620 PMCID: PMC10092859 DOI: 10.1363/psrh.12208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2023]
Abstract
INTRODUCTION Although debate remains about the saliency and relevance of pro-choice and pro-life labels (as abortion belief indicators), they have been consistently used for decades to broadly designate abortion identity. However, clear labels are less apparent in other languages (e.g., Spanish). Social media, as an exploratory data science tool, can be leveraged to identify the presence and popularity of online abortion identity labels and how they are contextualized online. PURPOSE This study aims to determine how popularly used Spanish-language pro-choice and pro-life identity labels are contextualized online. METHOD We used Latent Dirichlet Allocation (LDA) topic models, an unsupervised natural language processing (NLP) application, to generate themes about Spanish language tweets categorized by Spanish abortion identity labels: (1) proelección (pro-choice); (2) derecho a decidir (right to choose); (3) proaborto (pro-abortion); (4) provida (pro-life); (5) antiaborto (anti-abortion); and (6) derecho a vivir (right to life). We manually reviewed themes for each identity label to assess scope. RESULTS All six identity labels included in our analysis contained some references to abortion. However, several labels were not exclusive to abortion. Proelección (pro-choice), for example, contained several themes related to ongoing presidential elections. DISCUSSION AND CONCLUSION No singular Spanish abortion identity label encapsulates abortion beliefs; however, there are several viable options. Just as the debate remains ongoing about pro-choice and pro-life as accurate indicators of abortion beliefs in English, we must also consider that identity is more complex than binary labels in Spanish.
Affiliation(s)
- Danny Valdez
  - Department of Applied Health Science, Indiana University School of Public Health, Bloomington, Indiana, USA
- Kristen N. Jozkowski
  - Department of Applied Health Science, Indiana University School of Public Health, Bloomington, Indiana, USA
- María S. Montenegro
  - Department of Spanish and Portuguese Studies, Indiana University, Bloomington, Indiana, USA
- Brandon L. Crawford
  - Department of Applied Health Science, Indiana University School of Public Health, Bloomington, Indiana, USA
- Frederica Jackson
  - Department of Applied Health Science, Indiana University School of Public Health, Bloomington, Indiana, USA
|
36
|
Gangwar SS, Rathore SS, Chouhan SS, Soni S. Predictive modeling for suspicious content identification on Twitter. Social Network Analysis and Mining 2022; 12:149. [PMID: 36217359 PMCID: PMC9534460 DOI: 10.1007/s13278-022-00977-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2022] [Revised: 08/24/2022] [Accepted: 09/17/2022] [Indexed: 11/26/2022]
Abstract
The wide popularity of Twitter as a medium for exchanging activities, entertainment, and information has attracted spammers, who use it as a platform to spam users and spread misinformation. This challenges researchers to identify malicious content and user profiles on Twitter so that timely action can be taken. Many previous works have used different strategies to overcome this challenge and combat spammer activities on Twitter. In this work, we develop various models that utilize different features such as profile-based features, content-based features, and hybrid features to identify malicious content and classify it as spam or not-spam. In the first step, we collect and label a large dataset from Twitter to create a spam detection corpus. Then, we create a set of rich features by extracting various features from the collected dataset. Further, we apply different machine learning, ensemble, and deep learning techniques to build the prediction models. We performed a comprehensive evaluation of different techniques over the collected dataset and assessed the performance for accuracy, precision, recall, and f1-score measures. The results showed that the different sets of learning techniques used achieved high performance for tweet spam classification. In most cases, the values are above 90% for the different performance measures. These results show that using profile, content, user, and hybrid features for suspicious tweet detection helps build better prediction models.
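As an illustration of the four evaluation measures named in this abstract, the following is a minimal sketch of how accuracy, precision, recall, and F1-score can be computed for a binary spam classifier. The function name `classification_metrics`, the label strings, and the toy data are assumptions made for illustration only; they are not taken from the paper.

```python
def classification_metrics(y_true, y_pred, positive="spam"):
    """Compute accuracy, precision, recall and F1 for one positive class.

    Hypothetical helper for illustration; the labels are assumed to be
    plain strings such as "spam" and "not-spam".
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy example (made-up labels, not data from the study).
truth = ["spam", "spam", "not-spam", "not-spam"]
predicted = ["spam", "not-spam", "not-spam", "spam"]
metrics = classification_metrics(truth, predicted)
```

Reporting all four measures together matters for spam detection because class imbalance can make accuracy alone look high even when recall on the spam class is poor.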
|
37
|
Takats C, Kwan A, Wormer R, Goldman D, Jones HE, Romero D. Ethical and Methodological Considerations of Twitter Data for Public Health Research: Systematic Review. J Med Internet Res 2022; 24:e40380. [PMID: 36445739 PMCID: PMC9748795 DOI: 10.2196/40380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2022] [Revised: 11/08/2022] [Accepted: 11/13/2022] [Indexed: 11/15/2022] Open
Abstract
BACKGROUND Much research is being carried out using publicly available Twitter data in the field of public health, but the types of research questions that these data are being used to answer and the extent to which these projects require ethical oversight are not clear. OBJECTIVE This review describes the current state of public health research using Twitter data in terms of methods and research questions, geographic focus, and ethical considerations including obtaining informed consent from Twitter handlers. METHODS We implemented a systematic review, following PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, of articles published between January 2006 and October 31, 2019, using Twitter data in secondary analyses for public health research, which were found using standardized search criteria on SocINDEX, PsycINFO, and PubMed. Studies were excluded when using Twitter for primary data collection, such as for study recruitment or as part of a dissemination intervention. RESULTS We identified 367 articles that met eligibility criteria. Infectious disease (n=80, 22%) and substance use (n=66, 18%) were the most common topics for these studies, and sentiment mining (n=227, 62%), surveillance (n=224, 61%), and thematic exploration (n=217, 59%) were the most common methodologies employed. Approximately one-third of articles had a global or worldwide geographic focus; another one-third focused on the United States. The majority (n=222, 60%) of articles used a native Twitter application programming interface, and a significant amount of the remainder (n=102, 28%) used a third-party application programming interface. Only one-third (n=119, 32%) of studies sought ethical approval from an institutional review board, while 17% of them (n=62) included identifying information on Twitter users or tweets and 36% of them (n=131) attempted to anonymize identifiers. 
Most studies (n=272, 79%) included a discussion on the validity of the measures and reliability of coding (70% for interreliability of human coding and 70% for computer algorithm checks), but less attention was paid to the sampling frame, and what underlying population the sample represented. CONCLUSIONS Twitter data may be useful in public health research, given its access to publicly available information. However, studies should exercise greater caution in considering the data sources, accession method, and external validity of the sampling frame. Further, an ethical framework is necessary to help guide future research in this area, especially when individual, identifiable Twitter users and tweets are shared and discussed. TRIAL REGISTRATION PROSPERO CRD42020148170; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=148170.
Affiliation(s)
- Courtney Takats
  - City University of New York School of Public Health, New York City, NY, United States
- Amy Kwan
  - City University of New York School of Public Health, New York City, NY, United States
- Rachel Wormer
  - City University of New York School of Public Health, New York City, NY, United States
- Dari Goldman
  - City University of New York School of Public Health, New York City, NY, United States
- Heidi E Jones
  - City University of New York School of Public Health, New York City, NY, United States
- Diana Romero
  - City University of New York School of Public Health, New York City, NY, United States
|
38
|
Culp F, Wu Y, Wu D, Ren Y, Raynor P, Hung P, Qiao S, Li X, Eichelberger K. Understanding Alcohol Use Discourse and Stigma Patterns in Perinatal Care on Twitter. Healthcare (Basel) 2022; 10:2375. [PMID: 36553899 PMCID: PMC9778089 DOI: 10.3390/healthcare10122375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2022] [Revised: 11/21/2022] [Accepted: 11/24/2022] [Indexed: 11/29/2022] Open
Abstract
(1) Background: perinatal alcohol use generates a variety of health risks. Social media platforms discuss fetal alcohol spectrum disorder (FASD) and other widespread outcomes, providing personalized user-generated content about the perceptions and behaviors related to alcohol use during pregnancy. Data collected from Twitter underscores various narrative structures and sentiments in tweets that reflect large-scale discourses and foster societal stigmas; (2) Methods: We extracted alcohol-related tweets from May 2019 to October 2021 using an official Twitter search API based on a set of keywords provided by our clinical team. Our exploratory study utilized thematic content analysis and inductive qualitative coding methods to analyze user content. Iterative line-by-line coding categorized dynamic descriptive themes from a random sample of 500 tweets; (3) Results: qualitative methods from content analysis revealed underlying patterns among inter-user engagements, outlining individual, interpersonal and population-level stigmas about perinatal alcohol use and negative sentiment towards drinking mothers. As a result, the overall silence surrounding personal experiences with alcohol use during pregnancy suggests an unwillingness and sense of reluctancy from pregnant adults to leverage the platform for support and assistance due to societal stigmas; (4) Conclusions: identifying these discursive factors will facilitate more effective public health programs that take into account specific challenges related to social media networks and develop prevention strategies to help Twitter users struggling with perinatal alcohol use.
Affiliation(s)
- Fritz Culp
  - College of Engineering and Computing, University of South Carolina, Columbia, SC 29208, USA
- Yuqi Wu
  - Arnold School of Public Health, University of South Carolina, Columbia, SC 29208, USA
- Dezhi Wu
  - College of Engineering and Computing, University of South Carolina, Columbia, SC 29208, USA
- Yang Ren
  - College of Engineering and Computing, University of South Carolina, Columbia, SC 29208, USA
- Phyllis Raynor
  - College of Nursing, University of South Carolina, Columbia, SC 29208, USA
- Peiyin Hung
  - Arnold School of Public Health, University of South Carolina, Columbia, SC 29208, USA
- Shan Qiao
  - Arnold School of Public Health, University of South Carolina, Columbia, SC 29208, USA
- Xiaoming Li
  - Arnold School of Public Health, University of South Carolina, Columbia, SC 29208, USA
- Kacey Eichelberger
  - Prisma Health Upstate, University of South Carolina School of Medicine Greenville, Greenville, SC 29605, USA
|
39
|
Improving Public Health Policy by Comparing the Public Response during the Start of COVID-19 and Monkeypox on Twitter in Germany: A Mixed Methods Study. Vaccines (Basel) 2022; 10:1985. [PMID: 36560395 PMCID: PMC9787903 DOI: 10.3390/vaccines10121985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2022] [Revised: 11/06/2022] [Accepted: 11/17/2022] [Indexed: 11/24/2022] Open
Abstract
Little is known about monkeypox public concerns since its widespread emergence in many countries. Tweets in Germany were examined in the first three months of COVID-19 and monkeypox to examine concerns and issues raised by the public. Understanding views and positions of the public could help to shape future public health campaigns. Few qualitative studies reviewed large datasets, and the results provide the first instance of the public thinking comparing COVID-19 and monkeypox. We retrieved 15,936 tweets from Germany using query words related to both epidemics in the first three months of each one. A sequential explanatory mixed methods research joined a machine learning approach with thematic analysis using a novel rapid tweet analysis protocol. In COVID-19 tweets, there was the selfing construct or feeling part of the emerging narrative of the spread and response. In contrast, during monkeypox, the public considered othering after the fatigue of the COVID-19 response, or an impersonal feeling toward the disease. During monkeypox, coherence and reconceptualization of new and competing information produced a customer rather than a consumer/producer model. Public healthcare policy should reconsider a one-size-fits-all model during information campaigns and produce a strategic approach embedded within a customer model to educate the public about preventative measures and updates. A multidisciplinary approach could prevent and minimize mis/disinformation.
|
40
|
Discussions About COVID-19 Vaccination on Twitter in Turkey: Sentiment Analysis. Disaster Med Public Health Prep 2022; 17:e266. [PMID: 36226686 DOI: 10.1017/dmp.2022.229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Abstract
OBJECTIVES The present study aims to examine coronavirus disease 2019 (COVID-19) vaccination discussions on Twitter in Turkey and conduct sentiment analysis. METHODS The current study performed sentiment analysis of Twitter data with an artificial intelligence (AI) natural language processing (NLP) method. The tweets were retrieved retrospectively from March 10, 2020, when the first COVID-19 case was seen in Turkey, to April 18, 2022. A total of 10,308 tweets were accessed. The data were filtered before analysis due to excessive noise. First, the text was tokenized. Several steps were then applied to normalize the texts. Tweets about the COVID-19 vaccines were classified according to basic emotion categories using sentiment analysis. The resulting dataset was used for training and testing machine learning (ML) classifiers. RESULTS It was determined that 7.50% of the tweeters had positive, 0.59% negative, and 91.91% neutral opinions about the COVID-19 vaccination. When the accuracy values of the ML algorithms used in this study were examined, it was seen that the XGBoost (XGB) algorithm had higher scores. CONCLUSIONS Three of 4 tweets consist of negative and neutral emotions. The responsibility of professional chambers and the public is essential in transforming these neutral and negative feelings into positive ones.
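The abstract mentions tokenization and text normalization without listing the individual steps. The sketch below shows one plausible set of steps for tweets, using only the Python standard library. The specific choices (removing URLs and @mentions, keeping hashtag words, lowercasing) and the function name `normalize_tweet` are assumptions for illustration, not the procedure reported by the authors.

```python
import re

# Patterns for common tweet noise; which noise to remove is an assumption here.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
TOKEN_RE = re.compile(r"\w+")  # \w is Unicode-aware for str patterns in Python 3


def normalize_tweet(text):
    """Return a list of lowercase word tokens from a raw tweet (illustrative only)."""
    text = URL_RE.sub(" ", text)       # drop links
    text = MENTION_RE.sub(" ", text)   # drop @user mentions
    text = text.replace("#", " ")      # keep the hashtag word, drop the symbol
    # Note: str.lower() is not locale-aware, so the Turkish dotted/dotless "I"
    # is not handled specially; a real pipeline for Turkish text would need that.
    return TOKEN_RE.findall(text.lower())


tokens = normalize_tweet("@user Aşı oldum! https://t.co/x #COVID19")
```

A production pipeline would typically add further steps, such as stop-word removal or stemming, before the tokens are passed to a classifier.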
|
41
|
Tokac U, Brysiewicz P, Chipps J. Public perceptions on Twitter of nurses during the COVID-19 pandemic. Contemp Nurse 2022; 58:414-423. [PMID: 36370034 DOI: 10.1080/10376178.2022.2147850] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/13/2022]
Abstract
BACKGROUND The use of social media platforms to convey public opinions and attitudes has exponentially increased over the last decade on topics related to health. In all these social media postings related to the pandemic, specific attention has been focused on healthcare professionals, specifically nurses. OBJECTIVE This study aimed to explore how the keyword 'nurse' is located in COVID-19 pandemic-related tweets during a selected period of the pandemic in order to assess public perception. METHODS Tweets related to COVID-19 were downloaded from Twitter for the period January 1st, 2020, to November 11th, 2021. Sentiment analysis was used to identify opinions, emotions, and approaches expressed in tweets that included 'nurse', 'COVID-19', and 'pandemic' as either keywords or hashtags. RESULTS A total of 2,440,696 most used unique words in the downloaded 582,399 tweets were included and the sentiment analysis indicated that 24.4% (n = 595,530) of the tweets demonstrated positive sentiment while 14.1% (n = 343,433) of the tweets demonstrated negative sentiment during COVID-19. Within these results, 17% (n = 416,366) of the tweets included positive basic emotion words of trust and 4.9% (n = 120,654) of joy. In terms of negative basic emotion words, 9.9% (n = 241,758) of the tweets included the word fear, 8.3% (n = 202,179) anticipation, 7.9% (n = 193,145) sadness, 5.7% (n = 139,791) anger, 4.2% (n = 103,936) disgust, and 3.6% (n = 88,338) of the tweets included the word surprised. CONCLUSIONS It is encouraging to note that, with the advent of major health crises, public perceptions on social media appear to portray an image of nurses which reflects the professionalism and values of the profession.
Affiliation(s)
- Umit Tokac
  - UMSL College of Nursing, University of Missouri, One University Boulevard, St Louis, MO 63131-4400, USA
- Petra Brysiewicz
  - Discipline of Nursing, School of Nursing and Public Health, University of KwaZulu-Natal, Mazisi Kunene Road, Glenwood, Durban, 4041, South Africa
- Jennifer Chipps
  - School of Nursing, Faculty of Community Health Sciences, University of the Western Cape, 14 Blanckenberg Road, Bellville, Cape Town, South Africa
|
42
|
Belt RV, Rahimi K, Cai S. Researching the hard-to-reach: a scoping review protocol of digital health research in hidden, marginal and excluded populations. BMJ Open 2022; 12:e061361. [PMID: 36171043 PMCID: PMC9528575 DOI: 10.1136/bmjopen-2022-061361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2022] [Accepted: 08/30/2022] [Indexed: 11/04/2022] Open
Abstract
INTRODUCTION There is a significant growth in the use of digital technology and methods in health-related research, further driven by the COVID-19 pandemic. This has offered a potential to apply digital health research in hidden, marginalised and excluded populations who are traditionally not easily reached due to economic, societal and legal barriers. To better inform future digital health studies of these vulnerable populations, we proposed a scoping review to comprehensively map published evidence and guidelines on the applications and challenges of digital health research methods to hard-to-reach communities. METHODS AND ANALYSIS This review will follow the Arksey and O'Malley methodological framework for scoping reviews. The framework for the review will employ updated methods developed by the Joanna Briggs Institute including the Preferred Reporting Items for Systematic reviews and Meta-Analysis Scoping Review checklist. PubMed, the Cochrane Library, PsycINFO, Google Scholar and Greenfile are the identified databases for peer-reviewed quantitative and qualitative studies in-scope of the review. Grey literature focused on guidance and best practice in digital health research, and hard-to-reach populations will also be searched following published protocols. The review will focus on literature published between 1 February 2012 and 1 February 2022. Two reviewers are engaged in the review. After screening the title and abstract to determine the eligibility of each article, a thorough full-text review of eligible articles will be conducted using a data extraction framework. Key extracted information will be mapped in tabular and visualised summaries to categorise the breadth of literature and identify key digital methods, including their limitations and potential, for use in hard-to-reach populations. ETHICS AND DISSEMINATION This scoping review does not require ethical approval. 
The results of the scoping review will consist of peer-reviewed publications, presentations and knowledge mobilisation activities including a lay summary posted via social media channels and production of a policy brief.
Affiliation(s)
- Rachel Victoria Belt
  - Nuffield Department of Women's & Reproductive Health, University of Oxford, Oxford, UK
- Kazem Rahimi
  - Nuffield Department of Women's & Reproductive Health, University of Oxford, Oxford, UK
- Samuel Cai
  - Centre for Environmental Health and Sustainability, Department of Health Sciences, University of Leicester, Leicester, Leicestershire, UK
|
43
|
Zeng Z, Deng Q, Liu W. Knowledge sharing of health technology among clinicians in integrated care system: The role of social networks. Front Psychol 2022; 13:926736. [PMID: 36237697 PMCID: PMC9553305 DOI: 10.3389/fpsyg.2022.926736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2022] [Accepted: 09/08/2022] [Indexed: 02/05/2023] Open
Abstract
Promoting clinicians' knowledge sharing of appropriate health technology within the integrated care system (ICS) is vital to bridging the technological gap between member institutions. However, the role of social networks in knowledge sharing of health technology is still largely unknown. To address this issue, the study aims to clarify the influence of clinicians' social networks on knowledge sharing of health technology within the ICS. A questionnaire survey was conducted among the clinicians in the Alliance of Liver Disease Specialists in Fujian Province, China. Social network analysis was conducted using NetDraw and UCINET, and quadratic assignment procedure (QAP) multiple regression was used to analyze the factors influencing knowledge sharing of health technology. The results showed that the ICS played an insufficient role in promoting overall knowledge sharing, especially inter-institutional knowledge sharing. Trust, emotional support, material support, and cognitive proximity positively influenced knowledge sharing of health technology, while the frequency of interaction and relationship importance had a negative impact on it. The findings extend the research scope of social network theory to the field of healthcare and help bridge the evidence gap on the influence of clinicians' social networks on their knowledge sharing within the ICS, providing new ideas to boost knowledge sharing and diffusion of appropriate health technology.
|
44
|
Ji H, Wang J, Meng B, Cao Z, Yang T, Zhi G, Chen S, Wang S, Zhang J. Research on adaption to air pollution in Chinese cities: Evidence from social media-based health sensing. Environmental Research 2022; 210:112762. [PMID: 35065934 DOI: 10.1016/j.envres.2022.112762] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 10/12/2021] [Revised: 12/13/2021] [Accepted: 01/16/2022] [Indexed: 06/14/2023]
Abstract
Air pollution seriously threatens human health. Understanding the health effects of air pollution is of great importance for developing countermeasures. However, little is known in a comprehensive way about the real-time impacts of air pollution on human health in developing nations such as China. To fill this research gap, Chinese urbanites' health was sensed from more than 210.82 million Weibo (Chinese Twitter) data records in 2017. The association between air pollution and the sensed health was quantified through generalized additive models, based on which the sensitivities and adaptations to air pollution in 70 Chinese cities were assessed. The results showed that Weibo data can sense urbanites' health well in real time. Given their different geographical characteristics and socio-economic conditions, Chinese residents have adapted to air pollution, as indicated by the spatial heterogeneity of the sensitivities to air pollution. Cities with good air quality in South China and East China were more sensitive to air pollution, while cities with worse air quality in Northwest China and North China were less sensitive. This research provides a new perspective and methodologies for health sensing and the health effect of air pollution.
Affiliation(s)
- Huimin Ji
  - College of Applied Arts and Sciences, Beijing Union University, Beijing, 100191, China; Laboratory of Urban Cultural Sensing & Computing, Beijing Union University, Beijing, 100191, China
- Juan Wang
  - College of Applied Arts and Sciences, Beijing Union University, Beijing, 100191, China; Laboratory of Urban Cultural Sensing & Computing, Beijing Union University, Beijing, 100191, China
- Bin Meng
  - College of Applied Arts and Sciences, Beijing Union University, Beijing, 100191, China; Laboratory of Urban Cultural Sensing & Computing, Beijing Union University, Beijing, 100191, China
- Zheng Cao
  - School of Geographical Sciences, Guangzhou University, Guangzhou, 510006, China
- Tong Yang
  - College of Applied Arts and Sciences, Beijing Union University, Beijing, 100191, China; Laboratory of Urban Cultural Sensing & Computing, Beijing Union University, Beijing, 100191, China
- Guoqing Zhi
  - College of Applied Arts and Sciences, Beijing Union University, Beijing, 100191, China; Laboratory of Urban Cultural Sensing & Computing, Beijing Union University, Beijing, 100191, China
- Siyu Chen
  - College of Applied Arts and Sciences, Beijing Union University, Beijing, 100191, China; Laboratory of Urban Cultural Sensing & Computing, Beijing Union University, Beijing, 100191, China
- Shaohua Wang
  - Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, 100094, China
- Jingqiu Zhang
  - College of Applied Arts and Sciences, Beijing Union University, Beijing, 100191, China
|
45
|
León-Sandoval E, Zareei M, Barbosa-Santillán LI, Falcón Morales LE, Pareja Lora A, Ochoa Ruiz G. Monitoring the Emotional Response to the COVID-19 Pandemic Using Sentiment Analysis: A Case Study in Mexico. Computational Intelligence and Neuroscience 2022; 2022:4914665. [PMID: 35634092 PMCID: PMC9132622 DOI: 10.1155/2022/4914665] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/14/2021] [Revised: 03/07/2022] [Accepted: 03/31/2022] [Indexed: 11/17/2022]
Abstract
The world is facing the COVID-19 pandemic, leading to an unprecedented change in the lifestyle routines of millions. Beyond the general physical health, financial, and social repercussions of the pandemic, the adopted mitigation measures also present significant challenges for the population's mental health and health programs. It is complex for public organizations to measure the population's mental health in order to incorporate it into their own decision-making process. Traditional survey methods are time-consuming, expensive, and fail to provide the continuous information needed to respond to the rapidly evolving effects of governmental policies on the population's mental health. A significant portion of the population has turned to social media to express the details of their daily life, rendering this public data a rich field for understanding emotional and mental well-being. This study aims to track and measure the sentiment changes of the Mexican population in response to the COVID-19 pandemic. To this end, we analyzed 760,064,879 public domain tweets collected from a public access repository to examine the collective shifts in the general mood about the pandemic evolution, news cycles, and governmental policies using open sentiment analysis tools. Sentiment analysis polarity scores, which oscillate around -0.15, show a weekly seasonality according to Twitter's usage and a consistently negative outlook from the population. The analysis also points to increased controversy after the governmental decision to terminate the lockdown and after the celebrated holidays, which encouraged people to hold social gatherings. These findings expose the adverse emotional effects of the ongoing pandemic while showing an increase in social media usage rates of 2.38 times, which users employ as a coping mechanism to mitigate the feelings of isolation related to long-term social distancing. 
The findings have important implications in the mental health infrastructure for ongoing mitigation efforts and feedback on the perception of policies and other measures. The overall trend of the sentiment polarity is 0.0001110643.
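The abstract reports an overall trend value and a weekly seasonality for the polarity scores but does not say how either was computed. One common approach, assumed here purely for illustration, is to fit an ordinary least-squares line to a daily series of mean polarity scores and to average scores by position within the week. The function names and the toy series below are hypothetical and do not reproduce the study's data.

```python
from statistics import mean


def linear_trend(values):
    """Least-squares slope of values against their index (0, 1, 2, ...).

    Assumes one value per equally spaced time step, e.g. a daily mean polarity.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to estimate a trend")
    xs = list(range(n))
    x_bar, y_bar = mean(xs), mean(values)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


def weekday_means(values):
    """Average value for each position in a 7-step cycle (index modulo 7)."""
    buckets = {day: [] for day in range(7)}
    for i, value in enumerate(values):
        buckets[i % 7].append(value)
    return {day: mean(vs) for day, vs in buckets.items() if vs}


# Toy daily polarity series (made-up numbers).
daily_polarity = [-0.2, -0.1, 0.0]
slope = linear_trend(daily_polarity)
by_weekday = weekday_means(daily_polarity)
```

A slope close to zero, as with the very small trend value quoted in the abstract, would indicate that the average polarity changed little per time step over the observation window.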
Affiliation(s)
- Edgar León-Sandoval
  - School of Engineering and Sciences, Monterrey Institute of Technology and Higher Education, Monterrey, Mexico
- Mahdi Zareei
  - School of Engineering and Sciences, Monterrey Institute of Technology and Higher Education, Monterrey, Mexico
- Luis Eduardo Falcón Morales
  - School of Engineering and Sciences, Monterrey Institute of Technology and Higher Education, Monterrey, Mexico
- Gilberto Ochoa Ruiz
  - School of Engineering and Sciences, Monterrey Institute of Technology and Higher Education, Monterrey, Mexico
|
46
|
Walsh J, Dwumfour C, Cave J, Griffiths F. Spontaneously generated online patient experience data - how and why is it being used in health research: an umbrella scoping review. BMC Med Res Methodol 2022; 22:139. [PMID: 35562661 PMCID: PMC9106384 DOI: 10.1186/s12874-022-01610-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2021] [Accepted: 04/13/2022] [Indexed: 11/10/2022] Open
Abstract
PURPOSE Social media has led to fundamental changes in the way that people look for and share health-related information. There is increasing interest in using this spontaneously generated online patient experience (SGOPE) data as a data source for health research. The aim was to summarise the state of the art regarding how and why SGOPE data has been used in health research. We determined the sites and platforms used as data sources, the purposes of the studies, the tools and methods being used, and any identified research gaps. METHODS A scoping umbrella review was conducted looking at review papers from 2015 to Jan 2021 that studied the use of SGOPE data for health research. Using keyword searches we identified 1759 papers, from which we included 58 relevant studies in our review. RESULTS Data was used from many individual general or health-specific platforms, although Twitter was the most widely used data source. The most frequent purposes were surveillance based: tracking infectious disease, adverse event identification, and mental health triaging. Despite the developments in machine learning, the reviews included many small qualitative studies. Most NLP used supervised methods for sentiment analysis and classification. The field is at a very early stage and its methods need development. Methods were often not explained. There were disciplinary differences, with some work focused on accuracy tweaks and other work on application. There is little evidence of any work that either compares the results of both methods on the same data set or brings the ideas together. CONCLUSION Tools, methods, and techniques are still at an early stage of development, but strong consensus exists that this data source will become very important to patient-centred health research.
Affiliation(s)
- Julia Walsh
  - Warwick Medical School, University of Warwick, Coventry, UK
- Jonathan Cave
  - Department of Economics, University of Warwick, Coventry, UK
- Frances Griffiths
  - Warwick Medical School, University of Warwick, Coventry, UK
  - Centre for Health Policy, University of the Witwatersrand, Johannesburg, South Africa
|
47
|
Tahir B, Mehmood MA. Anbar: Collection and analysis of a large scale Urdu language Twitter corpus. Journal of Intelligent & Fuzzy Systems 2022. [DOI: 10.3233/jifs-219266] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Abstract
The confluence of high performance computing algorithms and large scale high-quality data has led to the availability of cutting edge tools in computational linguistics. However, these state-of-the-art tools are available only for the major languages of the world. The preparation of large scale high-quality corpora for low-resource languages such as Urdu is a challenging task as it requires huge computational and human resources. In this paper, we build and analyze a large scale Urdu language Twitter corpus, Anbar. For this purpose, we collect 106.9 million Urdu tweets posted by 1.69 million users during one year (September 2018-August 2019). Our corpus consists of tweets with a rich vocabulary of 3.8 million unique tokens along with 58K hashtags and 62K URLs. Moreover, it contains 75.9 million (71.0%) retweets and 847K geotagged tweets. Furthermore, we examine Anbar using a variety of metrics such as temporal frequency of tweets, vocabulary size, geo-location, user characteristics, and entity distribution. To the best of our knowledge, this is the largest repository of Urdu language tweets for the NLP research community, which can be used for Natural Language Understanding (NLU), social analytics, and fake news detection.
Affiliation(s)
- Bilal Tahir
  - Al-Khawarizmi Institute of Computer Science, University of Engineering and Technology, Lahore, Pakistan
- Muhammad Amir Mehmood
  - Al-Khawarizmi Institute of Computer Science, University of Engineering and Technology, Lahore, Pakistan
|
48
|
Hasan A, Levene M, Weston D, Fromson R, Koslover N, Levene T. Monitoring Covid-19 on social media using a novel triage and diagnosis approach. J Med Internet Res 2022; 24:e30397. [PMID: 35142636 PMCID: PMC8887561 DOI: 10.2196/30397] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/13/2021] [Revised: 07/09/2021] [Accepted: 02/05/2022] [Indexed: 12/23/2022]
Abstract
Background The COVID-19 pandemic has created a pressing need for integrating information from disparate sources in order to assist decision makers. Social media is important in this respect; however, to make sense of the textual information it provides and be able to automate the processing of large amounts of data, natural language processing methods are needed. Social media posts are often noisy, yet they may provide valuable insights regarding the severity and prevalence of the disease in the population. Here, we adopt a triage and diagnosis approach to analyzing social media posts using machine learning techniques for the purpose of disease detection and surveillance. We thus obtain useful prevalence and incidence statistics to identify disease symptoms and their severities, motivated by public health concerns. Objective This study aims to develop an end-to-end natural language processing pipeline for triage and diagnosis of COVID-19 from patient-authored social media posts in order to provide researchers and public health practitioners with additional information on the symptoms, severity, and prevalence of the disease rather than to provide an actionable decision at the individual level. Methods The text processing pipeline first extracted COVID-19 symptoms and related concepts, such as severity, duration, negations, and body parts, from patients’ posts using conditional random fields. An unsupervised rule-based algorithm was then applied to establish relations between concepts in the next step of the pipeline. The extracted concepts and relations were subsequently used to construct 2 different vector representations of each post. These vectors were separately applied to build support vector machine learning models to triage patients into 3 categories and diagnose them for COVID-19. 
Results We reported macro- and microaveraged F1 scores in the range of 71%-96% and 61%-87%, respectively, for the triage and diagnosis of COVID-19 when the models were trained on human-labeled data. Our experimental results indicated that similar performance can be achieved when the models are trained using predicted labels from concept extraction and rule-based classifiers, thus yielding end-to-end machine learning. In addition, we highlighted important features uncovered by our diagnostic machine learning models and compared them with the most frequent symptoms revealed in another COVID-19 data set. In particular, we found that the most important features are not always the most frequent ones. Conclusions Our preliminary results show that it is possible to automatically triage and diagnose patients for COVID-19 from social media natural language narratives, using a machine learning pipeline in order to provide information on the severity and prevalence of the disease for use within health surveillance systems.
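The macro- and microaveraged F1 scores reported in this abstract weight classes differently, and a minimal sketch may help show how. It assumes single-label multiclass predictions, such as a 3-category triage; the function name and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter

def f1_scores(y_true, y_pred):
    """Return (macro_f1, micro_f1) for single-label multiclass predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro: average the per-class F1 values, so every class counts equally.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: pool the counts over all classes first, so every post counts equally.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro

if __name__ == "__main__":
    # Toy triage labels, purely illustrative.
    truth = ["mild", "mild", "severe", "none"]
    predicted = ["mild", "severe", "severe", "none"]
    print(f1_scores(truth, predicted))
```

In the single-label setting the microaveraged F1 equals plain accuracy, which is one reason the two averages can differ noticeably when classes are imbalanced.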
Affiliation(s)
- Abul Hasan
  - Birkbeck, University of London, Malet Street, Bloomsbury, London, GB
- Mark Levene
  - Birkbeck, University of London, Malet Street, Bloomsbury, London, GB
- David Weston
  - Birkbeck, University of London, Malet Street, Bloomsbury, London, GB
- Renate Fromson
  - Barnet General Hospital, Wellhouse Lane, London EN5 3DJ, United Kingdom
- Nicolas Koslover
  - Barnet General Hospital, Wellhouse Lane, London EN5 3DJ, United Kingdom
- Tamara Levene
  - Barnet General Hospital, Wellhouse Lane, London EN5 3DJ, United Kingdom
|
49
|
Sahu KS, Majowicz SE, Dubin JA, Morita PP. NextGen Public Health Surveillance and the Internet of Things (IoT). Front Public Health 2021; 9:756675. [PMID: 34926381 PMCID: PMC8678116 DOI: 10.3389/fpubh.2021.756675] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/10/2021] [Accepted: 11/12/2021] [Indexed: 11/23/2022]
Abstract
Recent advances in technology have led to the rise of new-age data sources (e.g., Internet of Things (IoT), wearables, social media, and mobile health). IoT is becoming ubiquitous, and data generation is accelerating globally. Other health research domains have used IoT as a data source, but its potential has not been thoroughly explored or systematically utilized in public health surveillance. This article summarizes the existing literature on the use of IoT as a data source for surveillance. It presents the shortcomings of current data sources and how NextGen data sources, including the large-scale applications of IoT, can meet the needs of surveillance. The opportunities and challenges of using these modern data sources in public health surveillance are also explored. These IoT data ecosystems are being generated with minimal effort by the device users and benefit from high granularity, objectivity, and validity. Advances in computing are now bringing IoT-based surveillance into the realm of possibility. The potential advantages of IoT data include high-frequency, high-volume, zero-effort data collection methods, with the potential to support syndromic surveillance. In contrast, the critical challenges to mainstreaming this data source within surveillance systems are the huge volume and variety of data, the need to fuse data from multiple devices to produce a unified result, and the lack of multidisciplinary professionals who understand the domain and can analyze the domain data accordingly.
Affiliation(s)
- Kirti Sundar Sahu
  - School of Public Health Sciences, University of Waterloo, Waterloo, ON, Canada
- Shannon E. Majowicz
  - School of Public Health Sciences, University of Waterloo, Waterloo, ON, Canada
- Joel A. Dubin
  - School of Public Health Sciences, University of Waterloo, Waterloo, ON, Canada
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, ON, Canada
- Plinio Pelegrini Morita
  - School of Public Health Sciences, University of Waterloo, Waterloo, ON, Canada
  - Institute of Health Policy, Management, and Evaluation, University of Toronto, Toronto, ON, Canada
  - Department of Systems Design Engineering, University of Waterloo, Waterloo, ON, Canada
  - Ehealth Innovation, Techna Institute, University Health Network, Toronto, ON, Canada
  - Research Institute for Aging, University of Waterloo, Waterloo, ON, Canada
|
50
|
Artificial neural networks applied for predicting and explaining the education level of Twitter users. Social Network Analysis and Mining 2021; 11:112. [PMID: 34745380 PMCID: PMC8558764 DOI: 10.1007/s13278-021-00832-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2021] [Revised: 10/07/2021] [Accepted: 10/17/2021] [Indexed: 11/21/2022]
Abstract
This paper provides a novel procedure to estimate the education level of social network (SN) users by leveraging artificial neural networks (ANN). Additionally, it provides a robust methodology to extract explanatory insights from ANN models. It also contributes to the study of socio-demographic phenomena by utilizing less explored data sources, such as social media. It proposes Twitter data as an alternative data source for in-depth social studies, and ANN for complex pattern recognition. Moreover, cutting edge technology, such as face recognition, is applied to social media data to explain the social characteristics of country-specific users. We use nine variables and three hidden layers of neurons to identify high-skilled users. The resulting model describes the level of education well, estimating it correctly with an accuracy of 95% on the training set and 92% on a testing set. Approximately 30% of the analyzed users are highly skilled, and this share does not differ between the two genders. However, it tends to be lower among users younger than 30 years old.
|