1
|
Overall US Hospice Quality According to Decedent Caregivers-Natural Language Processing and Sentiment Analysis of 3389 Online Caregiver Reviews. Am J Hosp Palliat Care 2024; 41:527-544. [PMID: 37338245 DOI: 10.1177/10499091231185593] [Indexed: 06/21/2023]
Abstract
Objectives: With online hospice reviews an untapped quality resource, the study aimed to explore hospice caregiver experiences and assess their expectations of the hospice Medicare benefit. Methods: Topical and sentiment analysis of Google and Yelp caregiver reviews (n = 3393) posted between 2013 and 2023 was conducted with natural language processing (NLP) using Google NLP. Stratified sampling was weighted by hospice size to approximate the daily census of US hospice enrollees. Results: Overall caregiver sentiment of hospice care was neutral (S = .14). The domains "therapeutic, achievable expectations" and "misperceptions, unachievable expectations" were, respectively, the most and least prevalent. The four most prevalent topics all had moderately positive sentiment: caring staff; staff professionalism and knowledge; emotional, spiritual, bereavement support; and responsive, timely or helpful. The lowest sentiment scores were for lack of staffing; promises made, but not kept; pain, symptoms and medications; sped-up death, hastened, or sedated; and money, staff motivations. Significance of Results: Caregivers' overall rating of hospice was neutral, largely due to moderate sentiment on achievable expectations in two-thirds of reviews mixed with unachievable expectations in one-sixth of reviews. Hospice caregivers were most likely to recommend hospices with caring staff that provided quality care, responded to requests, and offered family support. Lack of staff and inadequate pain-symptom management were the two biggest barriers to hospice quality. All eight CAHPS measures were found in the discovered review topics. Close-ended CAHPS scores and open-ended online reviews offer complementary insights. Future research should explore associations between CAHPS and review insights.
|
2
|
Does Pollyanna hypothesis hold true in death narratives? A sentiment analysis approach. Acta Psychol (Amst) 2024; 245:104238. [PMID: 38565066 DOI: 10.1016/j.actpsy.2024.104238] [Received: 09/07/2023] [Revised: 03/24/2024] [Accepted: 03/28/2024] [Indexed: 04/04/2024]
Abstract
The Pollyanna hypothesis claims that human beings have a universal tendency to use positive words more frequently and broadly than negative words. The present study aims to test the Pollyanna hypothesis in medical death narratives at both lexical and text levels by using sentiment analysis and emotion detection methods, and to qualitatively analyze the contextual use of emotion words to deepen the understanding of doctors' emotions. Sentiment analysis showed a strong token-based linguistic positivity bias and a weak type-based negativity bias at the lexical level, and a general positivity bias at the text level, regardless of the gender of the doctors. Emotion detection discovered three prominent emotions of "joy", "sadness", and "anger", and a greater diversity of negative emotions than positive emotions in medical death narratives. Contextual analysis revealed that emotion words associated with joy were primarily observed in contexts related to doctors' actions and behaviors aiming to benefit others and promote social wellbeing. Emotion words associated with sadness and anger were chiefly employed to describe situations involving patients' death and doctors' attitudes towards death. The results confirm the Pollyanna hypothesis at both the token-based lexical level and the text level and falsify the hypothesis at the type-based lexical level. Possible explanations are explored through contextual analysis and through theoretical analysis from the perspectives of cognitive linguistics and social psychology. The findings are expected to enrich the understanding of the Pollyanna hypothesis as well as junior doctors' emotional responses to clinical deaths.
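The distinction between token-based and type-based counts can be illustrated with a minimal Python sketch. The lexicon and the sample text below are hypothetical toy data, not the study's dictionary or corpus; the point is only that the same text can lean positive by tokens (total occurrences) while leaning negative by types (distinct words).

```python
from collections import Counter

# Hypothetical toy lexicon (assumption); a real study would use an
# established sentiment dictionary.
POSITIVE = {"hope", "care", "comfort"}
NEGATIVE = {"death", "grief", "pain", "fear"}

def positivity_counts(tokens):
    """Count sentiment words by token (every occurrence) and by type (distinct words)."""
    counts = Counter(t.lower() for t in tokens)
    return {
        "pos_tokens": sum(c for w, c in counts.items() if w in POSITIVE),
        "neg_tokens": sum(c for w, c in counts.items() if w in NEGATIVE),
        "pos_types": sum(1 for w in counts if w in POSITIVE),
        "neg_types": sum(1 for w in counts if w in NEGATIVE),
    }

# Toy narrative: few positive words repeated often, many negative words used once.
text = "care care care hope hope comfort death grief pain fear"
result = positivity_counts(text.split())
print(result)
```

In this toy text, positive words dominate by token count while negative words dominate by type count, which mirrors the pattern the abstract reports.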
|
3
|
CIDER: Context-sensitive polarity measurement for short-form text. PLoS One 2024; 19:e0299490. [PMID: 38635650 PMCID: PMC11025856 DOI: 10.1371/journal.pone.0299490] [Received: 10/16/2023] [Accepted: 02/11/2024] [Indexed: 04/20/2024]
Abstract
Researchers commonly perform sentiment analysis on large collections of short texts like tweets, Reddit posts or newspaper headlines that are all focused on a specific topic, theme or event. Usually, general-purpose sentiment analysis methods are used. These perform well on average but miss the variation in meaning that happens across different contexts, for example, the word "active" has a very different intention and valence in the phrase "active lifestyle" versus "active volcano". This work presents a new approach, CIDER (Context Informed Dictionary and sEmantic Reasoner), which performs context-sensitive linguistic analysis, where the valence of sentiment-laden terms is inferred from the whole corpus before being used to score the individual texts. In this paper, we detail the CIDER algorithm and demonstrate that it outperforms state-of-the-art generalist unsupervised sentiment analysis techniques on a large collection of tweets about the weather. CIDER is also applicable to alternative (non-sentiment) linguistic scales. A case study on gender in the UK is presented, with the identification of highly gendered and sentiment-laden days. We have made our implementation of CIDER available as a Python package: https://pypi.org/project/ciderpolarity/.
|
4
|
Sentiment analysis of Indonesian tweets on COVID-19 and COVID-19 vaccinations. F1000Res 2024; 12:1007. [PMID: 38605817 PMCID: PMC11007366 DOI: 10.12688/f1000research.130610.3] [Accepted: 04/12/2024] [Indexed: 04/13/2024]
Abstract
Background Sentiments and opinions regarding COVID-19 and COVID-19 vaccination on Indonesian-language Twitter have scarcely been reported in one comprehensive study, which was the aim of our study. We also analyzed fake news and facts, and Twitter engagement, to understand people's perceptions and beliefs that determine public health literacy. Methods We collected 3,489,367 tweets from January 2020 to August 2021. We analyzed factual and fake news using the string comparison method. The difflib library was used to measure similarity. User engagement was analyzed by averaging the engagement metrics of tweets, retweets, favorites, replies, and posts shared with sentiments and opinions regarding COVID-19 and COVID-19 vaccination. Result Positive sentiments on COVID-19 and COVID-19 vaccination dominated; however, negative sentiments increased at the beginning of the implementation of restrictions on community activities (PPKM). The tweets were dominated by the importance of health protocols (washing hands, keeping distance, and wearing masks). Several types of vaccines were at the top of the word count in the vaccine subtopic. Acceptance of the vaccination increased during the studied period, and the fake news was outweighed by the facts. The tweets were dynamic and showed that the topics of engagement shifted from the nature of COVID-19 to vaccination and virus mutation, which peaked in the early and middle terms of 2021. Public sentiment and engagement shifted from hesitancy to anxiety about the safety and effectiveness of the vaccines, then changed again to wariness with the rise of the delta variant. Conclusion Understanding public sentiment and opinion can help policymakers to plan the best strategy to cope with the pandemic. Positive sentiments and fact-based opinions on COVID-19 and COVID-19 vaccination were predominant. However, sufficient health literacy levels could yet be predicted and sought for further study.
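The string-comparison step can be sketched with Python's standard difflib module. The abstract names difflib but not the exact call, so the use of SequenceMatcher.ratio() here is an assumption, and the reference statements and the tweet are hypothetical examples.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Return a similarity score in [0, 1] between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def best_match(tweet, references):
    """Return the reference statement most similar to the tweet."""
    return max(references, key=lambda ref: similarity(tweet, ref))

# Hypothetical reference statements (assumption, not from the study's data).
references = [
    "vaccines contain microchips",
    "masks reduce transmission",
]
tweet = "Masks reduce transmission"

match = best_match(tweet, references)
score = similarity(tweet, match)
print(match, score)
```

A study would typically pick a similarity cut-off above which a tweet counts as repeating a known claim; that threshold is a design choice the abstract does not state.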
|
5
|
Areas of interest and sentiment analysis towards second generation antipsychotics, lithium and mood stabilizing anticonvulsants: Unsupervised analysis using Twitter. J Affect Disord 2024; 351:649-660. [PMID: 38290587 DOI: 10.1016/j.jad.2024.01.234] [Received: 08/21/2023] [Revised: 01/23/2024] [Accepted: 01/26/2024] [Indexed: 02/01/2024]
Abstract
BACKGROUND Severe mental disorders like Schizophrenia and related psychotic disorders (SRD) or Bipolar Disorder (BD) require pharmacological treatment for relapse prevention and quality of life improvement. Yet, treatment adherence is a challenge, partly due to patients' attitudes and beliefs towards their medication. Social media listening offers insights into patient experiences and preferences, particularly in severe mental disorders. METHODS All tweets posted between 2008 and 2022 mentioning the names of the main drugs used in SRD and BD were analyzed using advanced artificial intelligence techniques such as machine learning, and deep learning, along with natural language processing. RESULTS In this 15-year study analyzing 893,289 tweets, second generation antipsychotics received more mentions in English tweets, whereas mood stabilizers received more tweets in Spanish. English tweets about economic and legal aspects displayed negative emotions, while Spanish tweets seeking advice showed surprise. Moreover, a recurring theme in Spanish tweets was the shortage of medications, evoking feelings of anger among users. LIMITATIONS This study's analysis of Twitter data, while insightful, may not fully capture the nuances of discussions due to the platform's brevity. Additionally, the wide therapeutic use of the studied drugs, complicates the isolation of disorder-specific discourse. Only English and Spanish tweets were examined, limiting the cultural breadth of the findings. CONCLUSION This study emphasizes the importance of social media research in understanding user perceptions of SRD and BD treatments. The results provide valuable insights for clinicians when considering how patients and the general public view and communicate about these treatments in the digital environment.
|
6
|
Opinion Mining by Convolutional Neural Networks for Maximizing Discoverability of Nanomaterials. J Chem Inf Model 2024; 64:2746-2759. [PMID: 37982753 DOI: 10.1021/acs.jcim.3c00746] [Indexed: 11/21/2023]
Abstract
The scientific literature contains valuable information that can be used for future applications, but manual analysis presents challenges due to its size and disciplinary boundaries. The prevailing solution involves natural language processing (NLP) techniques such as information retrieval. Nonetheless, existing automated systems primarily provide either statistically based shallow information or deep information without traceability, thereby falling short of delivering high-quality and reliable insights. To address this, we propose an innovative approach of leveraging sentiment information embedded within the literature to track the opinions toward materials. In this study, we integrated material knowledge into text representation and constructed opinion data sets to hierarchically train deep learning models, named as Scientific Sentiment Network (SSNet). SSNet can effectively extract knowledge from the energy material literature and accurately categorize expert opinions into challenges and opportunities (94% and 92% accuracy, respectively). By incorporating sentiment features determined by SSNet, we can predict the ranking of emerging thermoelectric materials with a 70% correlation to experimental outcomes. Furthermore, our model achieves a commendable 68% accuracy in predicting suitable nanomaterials for atomic layer deposition (ALD) over time. These promising results offer a practical framework to extract and synthesize knowledge from the scientific literature, thereby accelerating research in the field of nanomaterials.
|
7
|
Sentiment analysis using averaged weighted word vector features. PLoS One 2024; 19:e0299264. [PMID: 38573946 PMCID: PMC10994307 DOI: 10.1371/journal.pone.0299264] [Received: 10/24/2023] [Accepted: 02/06/2024] [Indexed: 04/06/2024]
Abstract
People use the World Wide Web heavily to share their experiences with entities such as products, services or travel destinations. Texts that provide online feedback through reviews and comments are essential for consumer decisions. These comments create a valuable source that may be used to measure satisfaction related to products or services. Sentiment analysis is the task of identifying opinions expressed in such text fragments. In this work, we develop two methods that combine different types of word vectors to learn and estimate the polarity of reviews. We create average review vectors from word vectors and add weights to these review vectors using word frequencies in positive and negative sensitivity-tagged reviews. We applied the methods to several datasets from different domains used as standard sentiment analysis benchmarks. We ensemble the techniques with each other and with existing methods, and we compare them with the approaches in the literature. The results show that our approaches outperform the state-of-the-art success rates.
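A frequency-weighted average of word vectors can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's formula: the weighting rule (1 plus the normalized gap between a word's counts in positive and negative reviews), the two-dimensional vectors, and the counts are all hypothetical.

```python
def polarity_weight(word, pos_counts, neg_counts):
    """Hypothetical weight: words skewed toward one class count more (assumption)."""
    p = pos_counts.get(word, 0)
    n = neg_counts.get(word, 0)
    if p + n == 0:
        return 1.0
    return 1.0 + abs(p - n) / (p + n)

def review_vector(tokens, vectors, pos_counts, neg_counts):
    """Weighted average of the word vectors of the known tokens in a review."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for tok in tokens:
        if tok not in vectors:
            continue  # skip out-of-vocabulary words
        w = polarity_weight(tok, pos_counts, neg_counts)
        for i, x in enumerate(vectors[tok]):
            total[i] += w * x
        weight_sum += w
    if weight_sum == 0.0:
        return total  # no known words: zero vector
    return [x / weight_sum for x in total]

# Toy embeddings and class-wise word counts (assumptions for illustration).
vectors = {"good": [1.0, 0.0], "bad": [0.0, 1.0], "movie": [0.5, 0.5]}
pos_counts = {"good": 9, "bad": 1, "movie": 5}
neg_counts = {"good": 1, "bad": 9, "movie": 5}

vec = review_vector(["good", "movie"], vectors, pos_counts, neg_counts)
print(vec)
```

The resulting review vector would then be passed to any standard classifier; because "good" is skewed toward positive reviews, it pulls the average more strongly than the class-neutral "movie".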
|
8
|
Domain adaptive learning for multi realm sentiment classification on big data. PLoS One 2024; 19:e0297028. [PMID: 38557742 PMCID: PMC10984522 DOI: 10.1371/journal.pone.0297028] [Received: 06/29/2023] [Accepted: 12/25/2023] [Indexed: 04/04/2024]
Abstract
Machine learning techniques that rely on textual features or sentiment lexicons can lead to erroneous sentiment analysis. These techniques are especially vulnerable to domain-related difficulties, particularly when dealing with big data. In addition, labeling is time-consuming, and supervised machine learning algorithms often lack labeled data. Transfer learning can help save time and obtain high performance with fewer datasets in this field. To address this, we used a transfer learning-based Multi-Domain Sentiment Classification (MDSC) technique. We are able to identify the sentiment polarity of text in an unlabeled target domain by looking at reviews in a labeled source domain. This research aims to evaluate the impact of domain adaptation and measure the extent to which transfer learning enhances sentiment analysis outcomes. We employed the transfer learning models BERT, RoBERTa, ELECTRA, and ULMFiT to improve performance in sentiment analysis. We analyzed sentiment through various transformer models and compared the performance of LSTM and CNN. The experiments are carried out on five publicly available sentiment analysis datasets, namely Hotel Reviews (HR), Movie Reviews (MR), Sentiment140 Tweets (ST), Citation Sentiment Corpus (CSC), and Bioinformatics Citation Corpus (BCC), to adapt to multiple target domains. The performance of numerous models employing transfer learning from diverse datasets demonstrates how various factors influence the outputs.
|
9
|
Using Natural Language Processing to Explore Social Media Opinions on Food Security: Sentiment Analysis and Topic Modeling Study. J Med Internet Res 2024; 26:e47826. [PMID: 38512326 PMCID: PMC10995791 DOI: 10.2196/47826] [Received: 04/13/2023] [Revised: 12/05/2023] [Accepted: 12/20/2023] [Indexed: 03/22/2024]
Abstract
BACKGROUND Social media has the potential to be of great value in understanding patterns in public health using large-scale analysis approaches (eg, data science and natural language processing [NLP]), 2 of which have been used in public health: sentiment analysis and topic modeling; however, their use in the area of food security and public health nutrition is limited. OBJECTIVE This study aims to explore the potential use of NLP tools to gather insights from real-world social media data on the public health issue of food security. METHODS A search strategy for obtaining tweets was developed using food security terms. Tweets were collected using the Twitter application programming interface from January 1, 2019, to December 31, 2021, filtered for Australia-based users only. Sentiment analysis of the tweets was performed using the Valence Aware Dictionary and Sentiment Reasoner. Topic modeling exploring the content of tweets was conducted using latent Dirichlet allocation with BigML (BigML, Inc). Sentiment, topic, and engagement (the sum of likes, retweets, quotations, and replies) were compared across years. RESULTS In total, 38,070 tweets were collected from 14,880 Twitter users. Overall, the sentiment when discussing food security was positive, although this varied across the 3 years. Positive sentiment remained higher during the COVID-19 lockdown periods in Australia. The topic model contained 10 topics (in order from highest to lowest probability in the data set): "Global production," "Food insecurity and health," "Use of food banks," "Giving to food banks," "Family poverty," "Food relief provision," "Global food insecurity," "Climate change," "Australian food insecurity," and "Human rights." The topic "Giving to food banks," which focused on support and donation, had the highest proportion of positive sentiment, and "Global food insecurity," which covered food insecurity prevalence worldwide, had the highest proportion of negative sentiment. 
When compared with news, there were some events, such as COVID-19 support payment introduction and bushfires across Australia, that were associated with high periods of positive or negative sentiment. Topics related to food insecurity prevalence, poverty, and food relief in Australia were not consistently more prominent during the COVID-19 pandemic than before the pandemic. Negative tweets received substantially higher engagement across 2019 and 2020. There was no clear relationship between topics that were more likely to be positive or negative and have higher or lower engagement, indicating that the identified topics are discrete issues. CONCLUSIONS In this study, we demonstrated the potential use of sentiment analysis and topic modeling to explore evolution in conversations on food security using social media data. Future use of NLP in food security requires the context of and interpretation by public health experts and the use of broader data sets, with the potential to track dimensions or events related to food security to inform evidence-based decision-making in this area.
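The comparison of engagement across sentiment classes can be sketched in a few lines of Python. Engagement is defined as in the abstract (the sum of likes, retweets, quotations, and replies). The compound scores below are hypothetical stand-ins for Valence Aware Dictionary and Sentiment Reasoner output, and the cut-offs of +0.05 and -0.05 follow the convention commonly suggested for that tool; whether the study used these cut-offs is an assumption.

```python
from collections import defaultdict

def label_from_compound(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a sentiment label (threshold is an assumption)."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"

def engagement(tweet):
    """Engagement as the sum of likes, retweets, quotations, and replies."""
    return tweet["likes"] + tweet["retweets"] + tweet["quotes"] + tweet["replies"]

def mean_engagement_by_label(tweets):
    """Average engagement for each sentiment label present in the data."""
    grouped = defaultdict(list)
    for t in tweets:
        grouped[label_from_compound(t["compound"])].append(engagement(t))
    return {label: sum(vals) / len(vals) for label, vals in grouped.items()}

# Hypothetical tweets with precomputed compound scores (toy data).
tweets = [
    {"compound": 0.62, "likes": 3, "retweets": 1, "quotes": 0, "replies": 0},
    {"compound": -0.48, "likes": 10, "retweets": 4, "quotes": 1, "replies": 5},
    {"compound": -0.30, "likes": 2, "retweets": 0, "quotes": 0, "replies": 0},
    {"compound": 0.01, "likes": 1, "retweets": 0, "quotes": 0, "replies": 1},
]
summary = mean_engagement_by_label(tweets)
print(summary)
```

Running the same aggregation separately for each calendar year would give the kind of year-by-year comparison the abstract describes.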
|
10
|
Has sentiment returned to the pre-pandemic level? A sentiment analysis using U.S. college subreddit data from 2019 to 2022. PLoS One 2024; 19:e0299837. [PMID: 38489275 PMCID: PMC10942064 DOI: 10.1371/journal.pone.0299837] [Received: 09/14/2023] [Accepted: 02/15/2024] [Indexed: 03/17/2024]
Abstract
BACKGROUND As the impact of the COVID-19 pandemic winds down, both individuals and society are gradually returning to life and activities before the pandemic. This study aims to explore how people's emotions have changed from the pre-pandemic period, through the pandemic, to this post-emergency period and whether the sentiment level nowadays has returned to the pre-pandemic level. METHOD We collected Reddit social media data in 2019 (pre-pandemic), 2020 (peak period of the pandemic), 2021, and 2022 (late stages of the pandemic, transitioning period to the post-emergency period) from the subreddit communities of 128 universities/colleges in the U.S., along with a set of school-level baseline characteristics such as location, enrollment, graduation rate, selectivity, etc. We predicted two sets of sentiments from a pre-trained Robustly Optimized BERT pre-training approach (RoBERTa) and a graph attention network (GAT) that leverages both the rich semantic information and the relational information among posted messages and then applied model stacking to obtain the final sentiment classification. After obtaining the sentiment label for each message, we employed a generalized linear mixed-effects model to estimate the temporal trend in sentiment from 2019 to 2022 and how the school-level factors may affect the sentiment. RESULTS Compared to the year 2019, the odds of negative sentiment in years 2020, 2021, and 2022 are 25%, 7.3%, and 6.3% higher, respectively, which are all statistically significant at the 5% significance level based on the multiplicity-adjusted p-values. CONCLUSIONS Our study findings suggest a partial recovery in the sentiment composition (negative vs. non-negative) in the post-pandemic-emergency era. The results align with common expectations and provide a detailed quantification of how sentiments have evolved from 2019 to 2022 in the sub-population represented by the sample examined in this study.
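Odds that are 25%, 7.3%, and 6.3% higher correspond to odds ratios of 1.25, 1.073, and 1.063 relative to 2019. The short Python sketch below shows how such an odds ratio translates into a probability. The baseline probability of 0.20 is a hypothetical value chosen for illustration; the abstract does not report the 2019 share of negative messages.

```python
def apply_odds_ratio(p_base, odds_ratio):
    """Convert a baseline probability to the probability implied by an odds ratio."""
    odds = p_base / (1.0 - p_base) * odds_ratio
    return odds / (1.0 + odds)

# Odds ratios implied by the abstract (odds 25%, 7.3%, 6.3% higher than 2019).
odds_ratios = {2020: 1.25, 2021: 1.073, 2022: 1.063}

# Hypothetical 2019 probability of a negative message (assumption, not from the study).
p_2019 = 0.20

implied = {year: apply_odds_ratio(p_2019, ratio) for year, ratio in odds_ratios.items()}
for year, p in implied.items():
    print(year, round(p, 4))
```

Note that a 25% increase in odds is not a 25% increase in probability: under the assumed baseline, the implied probability for 2020 rises from 0.20 to roughly 0.24.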
|
11
|
Evaluation of bias and gender/racial concordance based on sentiment analysis of narrative evaluations of clinical clerkships using natural language processing. BMC MEDICAL EDUCATION 2024; 24:295. [PMID: 38491461 PMCID: PMC10944013 DOI: 10.1186/s12909-024-05271-y] [Received: 10/10/2023] [Accepted: 03/06/2024] [Indexed: 03/18/2024]
Abstract
There is increasing interest in understanding potential bias in medical education. We used natural language processing (NLP) to evaluate potential bias in clinical clerkship evaluations. Data were drawn from medical evaluations and administrative databases for medical students enrolled in third-year clinical clerkship rotations across two academic years. We collected demographic information of students and faculty evaluators to determine gender/racial concordance (i.e., whether the student and faculty identified with the same demographic). We used a multinomial log-linear model for final clerkship grades, using predictors such as numerical evaluation scores, gender/racial concordance, and sentiment scores of narrative evaluations obtained with the SentimentIntensityAnalyzer tool in Python. 2037 evaluations from 198 students were analyzed. Statistical significance was defined as P < 0.05. Sentiment scores for evaluations did not vary significantly by student gender, race, or ethnicity (P = 0.88, 0.64, and 0.06, respectively). Word choices were similar across faculty and student demographic groups. Modeling showed narrative evaluation sentiment scores were not predictive of an honors grade (odds ratio [OR] 1.23, P = 0.58). Numerical evaluation average (OR 1.45, P < 0.001) and gender concordance between faculty and student (OR 1.32, P = 0.049) were significant predictors of receiving honors. The lack of disparities in narrative text in our study contrasts with prior findings from other institutions. Ongoing efforts include comparative analyses with other institutions to understand what institutional factors may contribute to bias. NLP enables a systematic approach for investigating bias. The insights gained from the lack of association between word choices, sentiment scores, and final grades show potential opportunities to improve feedback processes for students.
|
12
|
Chinese text dual attention network for aspect-level sentiment classification. PLoS One 2024; 19:e0295331. [PMID: 38451928 PMCID: PMC10919654 DOI: 10.1371/journal.pone.0295331] [Received: 04/02/2023] [Accepted: 11/20/2023] [Indexed: 03/09/2024]
Abstract
English text has a clear and compact subject structure, which makes it easy to find dependency relationships between words. However, Chinese text often conveys information using situational settings, which results in loose sentence structures, and most Chinese comments and experimental summary texts even lack subjects. This makes it challenging to determine the dependency relationships between words in Chinese text, especially in aspect-level sentiment recognition. To solve this problem, a Chinese text dual attention network for aspect-level sentiment recognition is proposed. First, Chinese syntactic dependency is proposed, and a sentiment dictionary is introduced to quickly and accurately extract aspect-level sentiment words, extract opinions, and classify sentiment trends in text. Additionally, in order to extract context-level features, the CNN-BILSTM model and position coding are also introduced. Finally, to better extract fine-grained aspect-level sentiment, a two-level attention mechanism is used. Compared with ten advanced baseline models, the model achieves better performance, with Accuracy of 0.9180, 0.9080 and 0.8380 respectively. Extensive experiments demonstrate that this method achieves higher performance in aspect-level sentiment recognition in less time, and ablation experiments demonstrate the importance of each module of the model.
|
13
|
Enhancing public health response: a framework for topics and sentiment analysis of COVID-19 in the UK using Twitter and the embedded topic model. Front Public Health 2024; 12:1105383. [PMID: 38450124 PMCID: PMC10915179 DOI: 10.3389/fpubh.2024.1105383] [Received: 11/22/2022] [Accepted: 01/10/2024] [Indexed: 03/08/2024]
Abstract
Introduction To protect citizens during the COVID-19 pandemic, unprecedented public health restrictions were imposed on everyday life in the UK and around the world. In emergencies like COVID-19, it is crucial for policymakers to be able to gauge the public response and sentiment to such measures in almost real-time and establish best practices for the use of social media for emergency response. Methods In this study, we explored Twitter as a data source for assessing public reaction to the pandemic. We conducted an analysis of sentiment by topic using 25 million UK tweets, collected from 26th May 2020 to 8th March 2021. We used an innovative combination of sentiment analysis via a recurrent neural network and topic clustering through an embedded topic model. Results The results demonstrated interpretable per-topic sentiment signals across time and geography in the UK that could be tied to specific public health and policy events during the pandemic. Unique to this investigation is the juxtaposition of derived sentiment trends against behavioral surveys conducted by the UK Office for National Statistics, providing a robust gauge of the public mood concurrent with policy announcements. Discussion While much of the existing research focused on specific questions or new techniques, we developed a comprehensive framework for the assessment of public response by policymakers for COVID-19 that is generalizable to future emergencies. The emergent methodology not only elucidates the public's stance on COVID-19 policies but also lets public policymakers monitor and assess the buy-in and acceptance of their policies almost in real-time. Further, the proposed approach could be applied to other subjects of political and public interest.
|
14
|
Enhancing machine learning-based sentiment analysis through feature extraction techniques. PLoS One 2024; 19:e0294968. [PMID: 38354193 PMCID: PMC10866497 DOI: 10.1371/journal.pone.0294968] [Received: 09/22/2023] [Accepted: 11/12/2023] [Indexed: 02/16/2024]
Abstract
A crucial part of sentiment classification is feature extraction because it involves extracting valuable information from text data, which affects the model's performance. The goal of this paper is to help in selecting a suitable feature extraction method to enhance the performance of sentiment analysis tasks. In order to provide directions for future machine learning and feature extraction research, it is important to analyze and summarize feature extraction techniques methodically from a machine learning standpoint. There are several methods under consideration, including Bag-of-words (BOW), Word2Vector, N-gram, Term Frequency-Inverse Document Frequency (TF-IDF), Hashing Vectorizer (HV), and Global vector for word representation (GloVe). To assess the ability of each feature extractor, we applied it to the Twitter US airlines and Amazon musical instrument reviews datasets. Finally, we trained a random forest classifier using 70% of the data for training and 30% for testing, enabling us to evaluate and compare the performance using different metrics. Based on our results, we find that the TF-IDF technique demonstrates superior performance, with an accuracy of 99% in the Amazon reviews dataset and 96% in the Twitter US airlines dataset. This study underscores the importance of feature extraction in sentiment analysis, offering practical insights to improve model performance and guide future research.
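TF-IDF weighting can be sketched in plain Python as follows. This uses one common textbook variant (term frequency as a share of document length, inverse document frequency as the natural log of N over document frequency); it is an assumption that this matches the paper's setup, and library implementations often add smoothing and normalization. The two documents are toy examples.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Compute TF-IDF scores for a list of tokenized, non-empty documents."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each word once per document
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            word: (count / len(doc)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return scores

# Toy corpus (assumption): two short, already tokenized reviews.
docs = [["great", "flight"], ["late", "flight"]]
scores = tf_idf(docs)
print(scores)
```

The word "flight" appears in every document, so its score is zero, while "great" and "late" receive positive weight. Down-weighting words shared across all documents is what makes TF-IDF features more discriminative than raw counts.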
|
15
|
Verification in the Early Stages of the COVID-19 Pandemic: Sentiment Analysis of Japanese Twitter Users. JMIR INFODEMIOLOGY 2024; 4:e37881. [PMID: 38127840 PMCID: PMC10849083 DOI: 10.2196/37881] [Received: 03/11/2022] [Revised: 11/20/2023] [Accepted: 12/17/2023] [Indexed: 12/23/2023]
Abstract
BACKGROUND The COVID-19 pandemic prompted global behavioral restrictions, impacting public mental health. Sentiment analysis, a tool for assessing individual and public emotions from text data, gained importance amid the pandemic. This study focuses on Japan's early public health interventions during COVID-19, utilizing sentiment analysis in infodemiology to gauge public sentiment on social media regarding these interventions. OBJECTIVE This study aims to investigate shifts in public emotions and sentiments before and after the first state of emergency was declared in Japan. By analyzing both user-generated tweets and retweets, we aim to discern patterns in emotional responses during this critical period. METHODS We conducted a day-by-day analysis of Twitter (now known as X) data using 4,894,009 tweets containing the keywords "corona," "COVID-19," and "new pneumonia" from March 23 to April 21, 2020, approximately 2 weeks before and after the first declaration of a state of emergency in Japan. We also processed tweet data into vectors for each word, employing the Fuzzy-C-Means (FCM) method, a type of cluster analysis, for the words in the sentiment dictionary. We set up 7 sentiment clusters (negative: anger, sadness, surprise, disgust; neutral: anxiety; positive: trust and joy) and conducted sentiment analysis of the tweet groups and retweet groups. RESULTS The analysis revealed a mix of positive and negative sentiments, with "joy" significantly increasing in the retweet group after the state of emergency declaration. Negative emotions, such as "worry" and "disgust," were prevalent in both tweet and retweet groups. Furthermore, the retweet group had a tendency to share more negative content compared to the tweet group. CONCLUSIONS This study conducted sentiment analysis of Japanese tweets and retweets to explore public sentiments during the early stages of COVID-19 in Japan, spanning 2 weeks before and after the first state of emergency declaration. 
The analysis revealed a mix of positive (joy) and negative (anxiety, disgust) emotions. Notably, joy increased in the retweet group after the emergency declaration, but this group also tended to share more negative content than the tweet group. This study suggests that the state of emergency heightened positive sentiments due to expectations for infection prevention measures, yet negative information also gained traction. These findings point to the potential for further exploration through network analysis.
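The Fuzzy-C-Means step named in the methods above can be illustrated with a minimal sketch of the standard membership formula, in which each vector receives a degree of membership in every cluster rather than a single hard label. The abstract does not give the authors' implementation; the 2-D vectors, the two cluster centers, the fuzzifier `m = 2`, and the function name `fcm_memberships` are all assumptions chosen for illustration.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Return the fuzzy membership of `point` in each cluster center.

    Standard Fuzzy-C-Means rule: u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)),
    where d_i is the Euclidean distance from the point to center i.
    The memberships always sum to 1.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on one or more centers belongs fully to them.
    zero = [i for i, d in enumerate(dists) if d == 0.0]
    if zero:
        return [1.0 / len(zero) if i in zero else 0.0 for i in range(len(centers))]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]

# Hypothetical toy data: one 2-D "word vector" and two cluster centers.
centers = [(0.0, 0.0), (4.0, 0.0)]
memberships = fcm_memberships((1.0, 0.0), centers)
print(memberships)
```

In a full Fuzzy-C-Means run this membership update alternates with a center update until the assignments stabilize; only the membership step is sketched here.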
|
16
|
A new word embedding model integrated with medical knowledge for deep learning-based sentiment classification. Artif Intell Med 2024; 148:102758. [PMID: 38325934 DOI: 10.1016/j.artmed.2023.102758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Revised: 11/19/2023] [Accepted: 12/29/2023] [Indexed: 02/09/2024]
Abstract
The popularity of social media platforms has made it possible to develop intelligent systems that use social media data for decision-making in numerous domains such as politics, business, marketing, and finance. However, the use of textual data from social media in healthcare management is still limited compared with other industries. Investigating how current machine learning and natural language processing technologies can be used in healthcare to gauge public sentiment is therefore an important line of study. Earlier works on healthcare sentiment analysis have used traditional word embedding models trained on general and medical corpora. However, the integration of medical knowledge into pre-trained word embedding models has not yet been considered. Word embedding models trained on a general corpus lack medical knowledge, and models trained on a small medical corpus have limitations in capturing semantic and syntactic properties. This research proposes a new word embedding model named Word Embedding Integrated with Medical Knowledge Vector (WE-iMKVec). The proposed model integrates sentiment lexicons and medical knowledge bases into the pre-trained word embedding to enrich its properties. A new medical-aware sentiment polarity score is proposed for use in neural-network sentiment learning, and the learned vectors are incorporated with the original pre-trained word vectors. The resulting vectors are enriched with lexicon vectors and medical knowledge vectors, namely the Adverse Drug Reaction (ADR) vector and the Unified Medical Language System (UMLS) vector, to build the proposed WE-iMKVec model. WE-iMKVec is validated on five different social media healthcare review datasets, and the empirical results show its superiority over traditional word embedding models in medical sentiment analysis. 
The highest improvement can be found in the patients.info medical condition dataset where the proposed model outperforms three conventional word2vec models (Google-News, PubMed-PMC, and Drug Reviews) by 12.7 %, 31.4 %, and 25.4 % respectively in terms of F1 score.
|
17
|
Public Opinion About COVID-19 on a Microblog Platform in China: Topic Modeling and Multidimensional Sentiment Analysis of Social Media. J Med Internet Res 2024; 26:e47508. [PMID: 38294856 PMCID: PMC10833090 DOI: 10.2196/47508] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Revised: 09/09/2023] [Accepted: 12/20/2023] [Indexed: 02/01/2024] Open
Abstract
BACKGROUND The COVID-19 pandemic raised wide concern across all walks of life globally. Social media platforms became an important channel for information dissemination and an effective medium for public sentiment transmission during the COVID-19 pandemic. OBJECTIVE Mining and analyzing social media text information can not only reflect the changes in public sentiment characteristics during the COVID-19 pandemic but also help the government understand the trends in public opinion and reasonably control public opinion. METHODS First, this study collected microblog comments related to the COVID-19 pandemic as a data set. Second, sentiment analysis was carried out based on the topic modeling method combining latent Dirichlet allocation (LDA) and Bidirectional Encoder Representations from Transformers (BERT). Finally, a machine learning linear regression (ML-LR) model combined with a sparse matrix was proposed to explore the evolutionary trend in public opinion on social media and verify the high accuracy of the model. RESULTS The experimental results show that the characteristics of public emotion differ across stages and that the overall trend is from negative to positive. CONCLUSIONS The proposed method can effectively reflect the temporal and spatial characteristics of public opinion. The results provide theoretical support and practical reference in response to public health and safety events.
|
18
|
A Comparison of ChatGPT and Fine-Tuned Open Pre-Trained Transformers (OPT) Against Widely Used Sentiment Analysis Tools: Sentiment Analysis of COVID-19 Survey Data. JMIR Ment Health 2024; 11:e50150. [PMID: 38271138 PMCID: PMC10813836 DOI: 10.2196/50150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Revised: 11/16/2023] [Accepted: 11/17/2023] [Indexed: 01/27/2024] Open
Abstract
BACKGROUND Health care providers and health-related researchers face significant challenges when applying sentiment analysis tools to health-related free-text survey data. Most state-of-the-art applications were developed in domains such as social media, and their performance in the health care context remains relatively unknown. Moreover, existing studies indicate that these tools often lack accuracy and produce inconsistent results. OBJECTIVE This study aims to address the lack of comparative analysis of sentiment analysis tools applied to health-related free-text survey data in the context of COVID-19. The objective was to automatically predict sentence sentiment for 2 independent COVID-19 survey data sets from the National Institutes of Health and Stanford University. METHODS Gold standard labels were created for a subset of each data set using a panel of human raters. We compared 8 state-of-the-art sentiment analysis tools on both data sets to evaluate variability and disagreement across tools. In addition, few-shot learning was explored by fine-tuning Open Pre-Trained Transformers (OPT; a large language model [LLM] with publicly available weights) using a small annotated subset and zero-shot learning using ChatGPT (an LLM without available weights). RESULTS The comparison of sentiment analysis tools revealed high variability and disagreement across the evaluated tools when applied to health-related survey data. OPT and ChatGPT demonstrated superior performance, outperforming all other sentiment analysis tools. Moreover, ChatGPT outperformed OPT, exhibiting higher accuracy by 6% and higher F-measure by 4% to 7%. CONCLUSIONS This study demonstrates the effectiveness of LLMs, particularly the few-shot learning and zero-shot learning approaches, in the sentiment analysis of health-related survey data. 
These results have implications for saving human labor and improving efficiency in sentiment analysis tasks, contributing to advancements in the field of automated sentiment analysis.
|
19
|
How satisfied are patients with nursing care and why? A comprehensive study based on social media and opinion mining. Inform Health Soc Care 2024; 49:14-27. [PMID: 38178275 DOI: 10.1080/17538157.2023.2297307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2024]
Abstract
To assess the overall experience of a patient in a hospital, many factors must be analyzed; nonetheless, one of the key aspects is the performance of nurses, as they closely interact with patients on many occasions. Nurses carry out many tasks that could be assessed to understand patient satisfaction and, consequently, the effectiveness of the services offered. To assess their performance, expensive and time-consuming methods such as questionnaires and interviews have traditionally been used; nevertheless, the development of social networks has allowed patients to convey their opinions freely and publicly. For that reason, in this study, a comprehensive analysis has been performed based on patients' opinions collected from a feedback platform for health and care services, to discover the topics about nurses that patients are most interested in. To do so, a topic modeling technique has been proposed. After this, sentiment analysis has been applied to classify the topics as satisfactory or unsatisfactory. Finally, the results have been compared with what patients think about doctors. The results highlight which topics are most relevant for assessing patient satisfaction and to what extent. The results indicate that opinion about nurses is, in general, more positive than opinion about doctors.
|
20
|
Content and sentiment analysis of gabapentinoid-related tweets: An infodemiology study. Drug Alcohol Rev 2024; 43:45-55. [PMID: 36539307 DOI: 10.1111/dar.13590] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/28/2022] [Revised: 11/22/2022] [Accepted: 11/30/2022] [Indexed: 12/24/2022]
Abstract
INTRODUCTION The increasing number of gabapentinoid (pregabalin and gabapentin) harms, including deaths, observed across countries is concerning to health-care professionals and policy makers. However, it is unclear if the public shares these concerns. This study aimed to describe posts related to gabapentinoids, conduct a content analysis to identify common themes and describe adverse events or symptoms. METHODS Keywords of 'pregabalin' or 'Lyrica' or 'gabapentin' or 'Neurontin' were used to search for related tweets posted by people in the community between 8 March and 7 May 2021. Eligible tweets included a keyword in the post. We extracted de-identified data which included descriptive data of the total number of posts over time; and data on individual tweets including date, number of re-tweets and post content. Data were exported separately for pregabalin- and gabapentin-related tweets. A 20% random sample was used for the thematic analysis. RESULTS There were 2931 pregabalin-related tweets and 2736 gabapentin-related tweets. Thematic analysis revealed three themes (sharing positive experiences and benefits of taking gabapentinoids, people voicing their negative experiences, and people seeking opinions and sharing information). Positive experiences of gabapentinoids were related to sharing stories and giving advice. This contrasted with negative experiences including ineffectiveness, withdrawals, side effects and frustration related to cost and insurance coverage. Brain fog was the most common adverse symptom reported. Gabapentinoid-related deaths were only mentioned in three tweets. DISCUSSION The increasing public health concern about gabapentinoid-related deaths did not translate into Twitter discussions.
|
21
|
Sentiment analysis of the COVID-19 vaccine perception. Health Informatics J 2024; 30:14604582241236131. [PMID: 38403926 DOI: 10.1177/14604582241236131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024]
Abstract
The sharp rise in coronavirus cases in the United States, as well as other countries, is driven by variants such as the Omicron substrains BA.4 and BA.5. Keeping up to date with COVID-19 vaccination and wearing masks are essential tools for mitigating the pandemic. Social media plays a vital role in sharing and exchanging information, but it also affects perceptions of social phenomena. In this study, we conducted sentiment analysis and topic modeling to investigate vaccine perception using 338,465 COVID-19 vaccine-related comments collected from January 2020 to May 2021 on Reddit. This study stands apart from prior COVID-related research on social media, particularly on Reddit, as it conducts separate analyses for each COVID vaccine and examines public sentiment in relation to various societal events, including vaccine development progress and government responses to COVID. The findings reveal two notable spikes in the number of comments containing the keyword "vaccine". This suggests that discussions about vaccines tend to increase during times of significant social and political events, indicating that people's attention and interest in the topic are influenced by current events. Understanding the public perception of vaccines and identifying factors influencing vaccine perception could help propose appropriate interventions to promote vaccination.
|
22
|
Applying BERT and ChatGPT for Sentiment Analysis of Lyme Disease in Scientific Literature. Methods Mol Biol 2024; 2742:173-183. [PMID: 38165624 DOI: 10.1007/978-1-0716-3561-2_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/04/2024]
Abstract
This chapter presents a practical guide for conducting sentiment analysis using Natural Language Processing (NLP) techniques in the domain of tick-borne disease text. The aim is to demonstrate how the presence of bias in the discourse surrounding chronic manifestations of the disease can be evaluated. Using a dataset of 5643 abstracts collected from scientific journals on the topic of chronic Lyme disease, the chapter demonstrates, in Python, the steps for conducting sentiment analysis with pretrained language models and the process of validating the preliminary results using both interpretable machine learning tools and a novel methodology that leverages emerging state-of-the-art large language models like ChatGPT. This serves as a useful resource for researchers and practitioners interested in using NLP techniques for sentiment analysis in the medical domain.
|
23
|
Depression detection for twitter users using sentiment analysis in English and Arabic tweets. Artif Intell Med 2024; 147:102716. [PMID: 38184345 DOI: 10.1016/j.artmed.2023.102716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2023] [Revised: 11/06/2023] [Accepted: 11/08/2023] [Indexed: 01/08/2024]
Abstract
Since depression often results in suicidal thoughts and can leave a person severely disabled in daily life, there is an elevated risk of premature mortality due to mental problems caused by depression. Therefore, it is crucial to identify the patient's mental illness as soon as possible. People are increasingly using social media platforms to express their opinions and share daily activities, which makes online platforms rich sources for early depression detection. The contribution of this paper is multifold. First, it presents five machine-learning models for Arabic and English depression detection using Twitter text. The best model for Arabic text achieved an F1 score of 96.6 % for binary classification into depressed and Non_dep. For English text without negation, the model achieved 92 % for binary classification and 88 % for multi-classification (depressed, indifferent, happy). For English text with negation, F1 scores of 87 % and 85 % were achieved for binary and multi-classification, respectively. Second, the work introduces a manually annotated corpus, Arabic_Dep_tweets_10,000, of 10,000 Arabic tweets, which covers neutral tweets as well as a variety of depressed and happy terms. In addition, it introduces two automatically annotated English corpora: Eng_without_negation_60.000, with 60,172 English tweets, and Eng_with_negation_57.000, with 57,392 English tweets. Both cover a wide range of depressed and cheerful terms; however, negation was included in the Eng_with_negation_57.000 corpus. Finally, this paper presents a depression-detection web application that implements our optimal models to detect tweets containing depression symptoms and to predict depression trends for a person using either English or Arabic.
|
24
|
Microvascular Decompression and Trigeminal Neuralgia: Patient Sentiment Analysis Using Natural Language Processing. World Neurosurg 2023; 180:e528-e536. [PMID: 37778624 DOI: 10.1016/j.wneu.2023.09.107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Accepted: 09/25/2023] [Indexed: 10/03/2023]
Abstract
OBJECTIVE Microvascular decompression (MVD) as a treatment for trigeminal neuralgia (TGN) has a high success rate but is associated with a risk of complications. This study analyzes Twitter to provide insights into discussions surrounding MVD for patients with TGN. METHODS A Twitter search performed in April 2022 yielded 491 tweets from 426 accounts. Tweets and accounts were classified thematically, and descriptive statistics were used for various social media metrics. Using a natural language processing machine learning algorithm, sentiment analysis (SA) was performed to evaluate patient perspectives before and after surgery, and a multivariate regression model was used to identify predictors of higher engagement metrics (likes, retweets, quote tweets, replies). RESULTS Most accounts were patients, caregivers, and other members of the public (70%). The most encountered themes were research (47%) and personal experiences (33.4%). SA of tweets about patient experiences showed that 40.2% of tweets were positive, 31.1% were neutral and 28.7% were negative. Negative tweets decreased significantly in postoperative tweets and mostly discussed complications or failure of surgery (63%). On multivariate analysis, only inclusion of media (photo or video) in a tweet was associated with higher engagement metrics. CONCLUSIONS This study provides a comprehensive review of Twitter use discussing MVD in TGN and is the first to assess patient satisfaction after treatment using SA. The data presented on patient perspectives on social media could help physicians establish direct lines of communication with patients, fostering more patient-focused care.
|
25
|
Hashtags in Plastic Surgery: A Sentiment Analysis of over 1 Million Tweets. Aesthetic Plast Surg 2023; 47:2874-2879. [PMID: 37037924 DOI: 10.1007/s00266-023-03340-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2022] [Accepted: 03/23/2023] [Indexed: 04/12/2023]
Abstract
BACKGROUND Current literature has sparse recommendations that guide social networking practices in plastic surgery. To address this, we used natural language processing and sentiment analysis to investigate the differences in plastic surgery-related terms and hashtags on Twitter. METHODS Over 1 million tweets containing keywords #plasticsurgery, #cosmeticsurgery, and their non-hashtagged versions plastic surgery and cosmetic surgery were collected from the Twitter Gardenhose feed spanning from 2012 to 2016. We extracted the average happiness/positivity (h-avg) using hedonometrics and created word-shift graphs to determine influential words. RESULTS The most popular keywords were plastic and cosmetic surgery, comprising more than 90% of the sample. The positivity scores for plastic surgery, cosmetic surgery, #plasticsurgery, and #cosmeticsurgery were 5.72, 6.00, 6.17, and 6.18, respectively. Compared to plastic surgery, the term cosmetic surgery was more positive because it lacked antagonistic words, such as "fake," "ugly," "bad," "fails," and "wrong." For similar reasons, #plasticsurgery and #cosmeticsurgery were more positively associated than their non-hashtagged counterparts. CONCLUSION Plastic surgery-related hashtags are more positively associated than their non-hashtagged versions. The language associated with such hashtags suggests a different user profile than the public and, given their underutilization, these hashtags remain viable channels for professionals to achieve their diverse social media goals. LEVEL OF EVIDENCE V This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine ratings, please refer to the Table of Contents or the online Instructions to Authors www.springer.com/00266.
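The average happiness score (h-avg) used above is, in hedonometric work, a frequency-weighted mean of per-word happiness ratings. The sketch below illustrates that calculation only. The tiny lexicon and its scores are hypothetical toy values, and the exclusion of near-neutral words (scores between 4 and 6 on a 1 to 9 scale) is an assumption borrowed from common hedonometer practice, not something the abstract states.

```python
from collections import Counter

def average_happiness(tokens, lexicon, stop_low=4.0, stop_high=6.0):
    """Frequency-weighted mean happiness of the scored words in `tokens`.

    Words missing from `lexicon`, or with a score strictly inside the
    neutral band (stop_low, stop_high), are ignored. Returns None when
    no scored word remains.
    """
    counts = Counter(token.lower() for token in tokens)
    total = 0.0
    weight = 0
    for word, freq in counts.items():
        score = lexicon.get(word)
        if score is None or stop_low < score < stop_high:
            continue
        total += score * freq
        weight += freq
    return total / weight if weight else None

# Hypothetical toy lexicon on a 1-9 happiness scale (not real ratings).
toy_lexicon = {"love": 8.0, "happy": 8.0, "fake": 3.0, "ugly": 2.0, "the": 5.0}
tokens = "the fake ugly surgery love love".split()
h_avg = average_happiness(tokens, toy_lexicon)
print(h_avg)
```

Dropping a low-scoring word such as "fake" from the token list raises the average, which is the kind of effect a word-shift comparison between two keyword groups makes visible.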
|
26
|
Use of sentiment analysis for capturing hospitalized cancer patients' experience from free-text comments in the Persian language. BMC Med Inform Decis Mak 2023; 23:275. [PMID: 38031102 PMCID: PMC10685532 DOI: 10.1186/s12911-023-02358-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Accepted: 10/30/2023] [Indexed: 12/01/2023] Open
Abstract
PURPOSE Today, the Internet provides access to many patients' experiences, which is crucial in assessing the quality of healthcare services. This paper introduces a model for detecting cancer patients' opinions about healthcare services in the Persian language, both positive and negative. METHOD To achieve the objectives of this study, a combination of sentiment analysis (SA) and topic modeling approaches was employed. All pertinent comments made by cancer patients were collected from the patient feedback form of the Tehran University of Medical Science (TUMS) Cancer Institute (CI) in Iran, from March to October 2021. Conventional evaluation metrics such as accuracy, precision, recall, and F-measure were utilized to assess the performance of the proposed model. RESULT The experimental findings revealed that the proposed SA model achieved accuracies of 89.3%, 92.6%, and 90.8% in detecting patients' sentiments towards general services, healthcare services, and life expectancy, respectively. Based on the topic modeling results, the topic "Metastasis" exhibited lower sentiment scores compared to other topics. Additionally, cancer patients expressed dissatisfaction with the current appointment booking service, while topics such as "Good experience," "Affable staff", and "Chemotherapy" garnered higher sentiment scores. CONCLUSION The combined use of SA and topic modeling offers valuable insights into healthcare services. Policymakers can utilize the knowledge obtained from these topics and associated sentiments to enhance patient satisfaction with cancer institution services.
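The evaluation metrics named in the method above (accuracy, precision, recall, and F-measure) can be computed from predicted and gold labels as in the following sketch. The labels are made-up toy data, the function name `classification_metrics` is hypothetical, and treating "positive" as the target class is an assumption; none of this reproduces the study's own evaluation.

```python
def classification_metrics(y_true, y_pred, positive="positive"):
    """Accuracy, precision, recall, and F1 for one target class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy gold labels and model predictions (illustrative only).
y_true = ["positive", "positive", "negative", "negative", "positive"]
y_pred = ["positive", "negative", "negative", "positive", "positive"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```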
|
27
|
Online attitudes about the first approved systemic treatment for alopecia areata: a sentiment analysis of Reddit posts. Clin Exp Dermatol 2023; 48:1369-1370. [PMID: 37503761 DOI: 10.1093/ced/llad254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Accepted: 08/18/2023] [Indexed: 07/29/2023]
Abstract
Alopecia areata (AA) is a complex inflammatory skin disease with a tremendous physical and emotional burden. Understanding patient sentiment towards AA treatment, particularly those treatments approved for AA, may help in better addressing patient needs. Here, we analysed 13 771 Reddit posts from the ‘alopecia_areata’ subreddit using sentiment analysis to evaluate online attitudes about the first approved systemic treatment for AA, baricitinib. We show that posts including baricitinib or related terms are more likely to be positive, with a higher likelihood of being positive after Food and Drug Administration approval.
|
28
|
MetaQA: Enhancing human-centered data search using Generative Pre-trained Transformer (GPT) language model and artificial intelligence. PLoS One 2023; 18:e0293034. [PMID: 37956160 PMCID: PMC10642800 DOI: 10.1371/journal.pone.0293034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2023] [Accepted: 10/03/2023] [Indexed: 11/15/2023] Open
Abstract
Accessing and utilizing geospatial data from various sources is essential for developing scientific research to address complex scientific and societal challenges that require interdisciplinary knowledge. The traditional keyword-based geosearch approach is insufficient due to the uncertainty inherent within spatial information and how it is presented in the data-sharing platform. For instance, the Gulf of Mexico Coastal Ocean Observing System (GCOOS) data search platform stores geoinformation and metadata in a complex tabular format. Users can search for data by entering keywords or selecting data from a drop-down menu in the user interface. However, the search results provide limited information about the data product: detailed descriptions, potential uses, and relationships with other data products are still missing. Language models (LMs) have demonstrated great potential in tasks like question answering, sentiment analysis, text classification, and machine translation. However, they struggle when dealing with metadata represented in tabular format. To overcome these challenges, we developed Meta Question Answering System (MetaQA), a novel spatial data search model. MetaQA integrates end-to-end AI models with a generative pre-trained transformer (GPT) to enhance geosearch services. Using GCOOS metadata as a case study, we tested the effectiveness of MetaQA. The results revealed that MetaQA outperforms state-of-the-art question-answering models in handling tabular metadata, underlining its potential for user-inspired geosearch services.
|
29
|
Comparing text mining and manual coding methods: Analysing interview data on quality of care in long-term care for older adults. PLoS One 2023; 18:e0292578. [PMID: 37939098 PMCID: PMC10631650 DOI: 10.1371/journal.pone.0292578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Accepted: 09/24/2023] [Indexed: 11/10/2023] Open
Abstract
OBJECTIVES In long-term care for older adults, large amounts of text are collected relating to the quality of care, such as transcribed interviews. Researchers currently analyze textual data manually to gain insights, which is a time-consuming process. Text mining could provide a solution, as this methodology can be used to analyze large amounts of text automatically. This study aims to compare text mining to manual coding with regard to sentiment analysis and thematic content analysis. METHODS Data were collected from interviews with residents (n = 21), family members (n = 20), and care professionals (n = 20). Text mining models were developed and compared to the manual approach. The results of the manual and text mining approaches were evaluated based on three criteria: accuracy, consistency, and expert feedback. Accuracy assessed the similarity between the two approaches, while consistency determined whether each individual approach found the same themes in similar text segments. Expert feedback served as a representation of the perceived correctness of the text mining approach. RESULTS An accuracy analysis revealed that more than 80% of the text segments were assigned the same themes and sentiment using both text mining and manual approaches. Interviews coded with text mining demonstrated higher consistency compared to those coded manually. Expert feedback identified certain limitations in both the text mining and manual approaches. CONCLUSIONS AND IMPLICATIONS While these analyses highlighted the current limitations of text mining, they also exposed certain inconsistencies in manual analysis. This information suggests that text mining has the potential to be an effective and efficient tool for analysing large volumes of textual data in the context of long-term care for older adults.
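The accuracy criterion described above, the share of text segments given the same theme and sentiment by both approaches, amounts to a simple percent-agreement calculation. The sketch below shows one way to compute it. The segment codes are invented toy data and the function name `percent_agreement` is hypothetical; the abstract does not specify how the authors computed their figure.

```python
def percent_agreement(manual, automatic):
    """Fraction of segments coded identically by the two approaches."""
    if len(manual) != len(automatic):
        raise ValueError("both codings must cover the same segments")
    if not manual:
        raise ValueError("at least one coded segment is required")
    matches = sum(1 for m, a in zip(manual, automatic) if m == a)
    return matches / len(manual)

# Toy (theme, sentiment) codes per text segment; illustrative only.
manual_codes = [
    ("staff", "positive"),
    ("food", "negative"),
    ("privacy", "neutral"),
    ("staff", "positive"),
    ("activities", "positive"),
]
text_mining_codes = [
    ("staff", "positive"),
    ("food", "neutral"),  # sentiment differs from the manual code
    ("privacy", "neutral"),
    ("staff", "positive"),
    ("activities", "positive"),
]
agreement = percent_agreement(manual_codes, text_mining_codes)
print(f"{agreement:.0%}")
```

Raw agreement does not correct for matches that would occur by chance, so it is best read alongside the consistency and expert-feedback criteria the study also reports.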
|
30
|
Temporal and Emotional Variations in People's Perceptions of Mass Epidemic Infectious Disease After the COVID-19 Pandemic Using Influenza A as an Example: Topic Modeling and Sentiment Analysis Based on Weibo Data. J Med Internet Res 2023; 25:e49300. [PMID: 37917144 PMCID: PMC10654902 DOI: 10.2196/49300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Revised: 09/20/2023] [Accepted: 10/11/2023] [Indexed: 11/03/2023] Open
Abstract
BACKGROUND The COVID-19 pandemic has had profound impacts on society, including public health, the economy, daily life, and social interactions. Social distancing measures, travel restrictions, and the influx of pandemic-related information on social media have all led to a significant shift in how individuals perceive and respond to health crises. In this context, there is a growing awareness of the role that social media platforms such as Weibo, among the largest and most influential social media sites in China, play in shaping public sentiment and influencing people's behavior during public health emergencies. OBJECTIVE This study aims to gain a comprehensive understanding of the sociospatial impact of mass epidemic infectious disease by analyzing the spatiotemporal variations and emotional orientations of the public after the COVID-19 pandemic. We use the outbreak of influenza A after the COVID-19 pandemic as a case study. Through temporal and spatial analyses, we aim to uncover specific variations in the attention and emotional orientations of people living in different provinces in China regarding influenza A. We sought to understand the societal impact of large-scale infectious diseases and the public's stance after the COVID-19 pandemic to improve public health policies and communication strategies. METHODS We selected Weibo as the data source and collected all influenza A-related Weibo posts from November 1, 2022, to March 31, 2023. These data included user names, geographic locations, posting times, content, repost counts, comments, likes, user types, and more. Subsequently, we used latent Dirichlet allocation topic modeling to analyze the public's focus as well as the bidirectional long short-term memory model to conduct emotional analysis. We further classified the focus areas and emotional orientations of different regions. 
RESULTS The research findings indicate that, compared with China's western provinces, the eastern provinces exhibited a higher volume of Weibo posts, demonstrating a greater interest in influenza A. Moreover, inland provinces displayed elevated levels of concern compared with coastal regions. In addition, female users of Weibo exhibited a higher level of engagement than male users, with regular users comprising the majority of user types. The public's focus was categorized into 23 main themes, with the overall emotional sentiment predominantly leaning toward negativity (making up 7562 out of 9111 [83%] sentiments). CONCLUSIONS The results of this study underscore the profound societal impact of the COVID-19 pandemic. People tend to be pessimistic toward new large-scale infectious diseases, and disparities exist in the levels of concern and emotional sentiments across different regions. This reflects diverse societal responses to health crises. By gaining an in-depth understanding of the public's attitudes and focal points regarding these infectious diseases, governments and decision makers can better formulate policies and action plans to cater to the specific needs of different regions and enhance public health awareness.
|
31
|
Amharic political sentiment analysis using deep learning approaches. Sci Rep 2023; 13:17982. [PMID: 37864050 PMCID: PMC10589327 DOI: 10.1038/s41598-023-45137-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Accepted: 10/16/2023] [Indexed: 10/22/2023] Open
Abstract
This study delves into the realm of sentiment analysis in the Amharic language, focusing on political sentences extracted from social media platforms in Ethiopia. The research employs deep learning techniques, including Convolutional Neural Networks (CNN), Bidirectional Long Short-Term Memory (Bi-LSTM), and a hybrid model combining CNN with Bi-LSTM to analyze and classify sentiments. The hybrid CNN-Bi-LSTM model emerges as the top performer, achieving an impressive accuracy of 91.60%. While these results mark a significant milestone, challenges persist, such as the need for a more extensive and diverse dataset and the identification of nuanced sentiments like sarcasm and figurative speech. The study underscores the importance of transitioning from binary sentiment analysis to a multi-class classification approach, enabling a finer-grained understanding of sentiments. Moreover, the establishment of a standardized corpus for Amharic sentiment analysis emerges as a critical endeavor with broad applicability beyond politics, spanning domains like agriculture, industry, tourism, sports, entertainment, and satisfaction analysis. The exploration of sarcastic comments in the Amharic language stands out as a promising avenue for future research.
|
32
|
Evaluating the Effects of Misinformation on Public Sentiments Surrounding Access to Abortion Through Social Media Sentiment Analytics. Stud Health Technol Inform 2023; 309:304-305. [PMID: 37869866 DOI: 10.3233/shti230805] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2023]
Abstract
As social media use has grown in recent years, ease of access and rapid data collection through online social media has permitted researchers to measure and track sentiments related to emerging public health threats. Herein, we explore the possibilities of examining messaging shared via social media networks for sentiment classification as it relates to women's reproductive healthcare, especially access to abortion. In our previous works, our team has successfully employed various natural language processing (NLP) models for the analysis of social media shared sentiments. This study reports a work-in-progress on the similar use of fine-tuned NLPs (i.e., DistilRoBERTa) to collect/analyze the sentiments of socio-behavioral data shared via social networks to uncover a correlation between reproductive-related misinformation (i.e., access to abortion) and public sentiments/discourse direction.
|
33
|
Content framing role on public sentiment formation for pre-crisis detection on sensitive issue via sentiment analysis and content analysis. PLoS One 2023; 18:e0287367. [PMID: 37851696 PMCID: PMC10584141 DOI: 10.1371/journal.pone.0287367] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2022] [Accepted: 06/04/2023] [Indexed: 10/20/2023] Open
Abstract
Social media has been tremendously used worldwide for a variety of purposes. Therefore, engagement activities such as comments have attracted many scholars due to their ability to reveal many critical findings, such as the role of users' sentiment. However, there is a lacuna on how to detect crisis based on users' sentiment through comments, and for such, we explore framing theory in the study herein to determine users' sentiment in predicting crisis. Generic content framing theory consists of conflict, economic, human interest, morality, and responsibility attributes frames as independent variables, with sentiment as the dependent variable. Comments from selected Facebook posting case studies were extracted and analysed using sentiment analysis via Application Programme Interface (API) webtool. The comments were then further analysed using content analysis via Positive and Negative Affect Schedule (PANAS) scale and statistically evaluated using SEM-PLS. Model shows that 44.8% of emotion and reactions towards sensitive issue posting are influenced by independent variables. Only economic consequences and responsibility attributes frame had correlation towards emotion and reaction at p<0.05. News reporting on direction towards economic and responsibility attributes sparks negative sentiment, which proves that it can best be described as pre-crisis detection to assist the Royal Malaysian Police and other relevant stakeholders to prevent criminal activities in their respective social media.
|
34
|
Quantum computing and machine learning for Arabic language sentiment classification in social media. Sci Rep 2023; 13:17305. [PMID: 37828056 PMCID: PMC10570340 DOI: 10.1038/s41598-023-44113-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2023] [Accepted: 10/03/2023] [Indexed: 10/14/2023] Open
Abstract
With the increasing amount of digital data generated by Arabic speakers, the need for effective and efficient document classification techniques is more important than ever. In recent years, both quantum computing and machine learning have shown great promise in the field of document classification. However, there is a lack of research investigating the performance of these techniques on the Arabic language. This paper presents a comparative study of quantum computing and machine learning for two datasets of Arabic language document classification. In the first dataset of 213,465 Arabic tweets, both classic machine learning (ML) and quantum computing approaches achieve high accuracy in sentiment analysis, with quantum computing slightly outperforming classic ML. Quantum computing completes the task in approximately 59 min, slightly faster than classic ML, which takes around 1 h. The precision, recall, and F1 score metrics indicate the effectiveness of both approaches in predicting sentiment in Arabic tweets. Classic ML achieves precision, recall, and F1 score values of 0.8215, 0.8175, and 0.8121, respectively, while quantum computing achieves values of 0.8239, 0.8199, and 0.8147, respectively. In the second dataset of 44,000 tweets, both classic ML (using the Random Forest algorithm) and quantum computing demonstrate significantly reduced processing times compared to the first dataset, with no substantial difference between them. Classic ML completes the analysis in approximately 2 min, while quantum computing takes approximately 1 min and 53 s. The accuracy of classic ML is higher at 0.9241 compared to 0.9205 for quantum computing. However, both approaches achieve high precision, recall, and F1 scores, indicating their effectiveness in accurately predicting sentiment in the dataset. 
Classic ML achieves precision, recall, and F1 score values of 0.9286, 0.9241, and 0.9249, respectively, while quantum computing achieves values of 0.92456, 0.9205, and 0.9214, respectively. The analysis of the metrics indicates that quantum computing approaches are effective in identifying positive instances and capturing relevant sentiment information in large datasets. On the other hand, traditional machine learning techniques exhibit faster processing times when dealing with smaller dataset sizes. This study provides valuable insights into the strengths and limitations of quantum computing and machine learning for Arabic document classification, emphasizing the potential of quantum computing in achieving high accuracy, particularly in scenarios where traditional machine learning techniques may encounter difficulties. These findings contribute to the development of more accurate and efficient document classification systems for Arabic data.
|
35
|
Sentiment Analysis of Tweets on Menu Labeling Regulations in the US. Nutrients 2023; 15:4269. [PMID: 37836553 PMCID: PMC10574510 DOI: 10.3390/nu15194269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 10/01/2023] [Accepted: 10/05/2023] [Indexed: 10/15/2023] Open
Abstract
Menu labeling regulations in the United States mandate chain restaurants to display calorie information for standard menu items, intending to facilitate healthy dietary choices and address obesity concerns. For this study, we utilized machine learning techniques to conduct a novel sentiment analysis of public opinions regarding menu labeling regulations, drawing on Twitter data from 2008 to 2022. Tweets were collected through a systematic search strategy and annotated as positive, negative, neutral, or news. Our temporal analysis revealed that tweeting peaked around major policy announcements, with a majority categorized as neutral or news-related. The prevalence of news tweets declined after 2017, as neutral views became more common over time. Deep neural network models like RoBERTa achieved strong performance (92% accuracy) in classifying sentiments. Key predictors of tweet sentiments identified by the random forest model included the author's followers and tweeting activity. Despite limitations such as Twitter's demographic biases, our analysis provides unique insights into the evolution of perceptions on the regulations since their inception, including the recent rise in negative sentiment. It underscores social media's utility for continuously monitoring public attitudes to inform health policy development, execution, and refinement.
|
36
|
Analyzing influence of COVID-19 on crypto & financial markets and sentiment analysis using deep ensemble model. PLoS One 2023; 18:e0286541. [PMID: 37768959 PMCID: PMC10538772 DOI: 10.1371/journal.pone.0286541] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2022] [Accepted: 05/18/2023] [Indexed: 09/30/2023] Open
Abstract
COVID-19 affected the world's economy severely and increased the inflation rate in both developed and developing countries. COVID-19 also affected the financial markets and crypto markets significantly; however, some crypto markets flourished and touched their peak during the pandemic era. This study performs an analysis of the impact of COVID-19 on public opinion and sentiments regarding the financial markets and crypto markets. It conducts sentiment analysis on tweets related to financial markets and crypto markets posted during COVID-19 peak days. Using sentiment analysis, it investigates people's sentiments regarding investment in these markets during COVID-19. In addition, damage analysis in terms of market value is also carried out, along with identification of the worst time for financial and crypto markets. For analysis, the data is extracted from Twitter using the SNSscraper library. This study proposes a hybrid model called CNN-LSTM (convolutional neural network-long short-term memory model) for sentiment classification. CNN-LSTM achieves F1 scores of 0.89 and 0.92 for crypto and financial markets, respectively. Moreover, topic extraction from the tweets is also performed along with the sentiments related to each topic.
|
37
|
Construction of an Emotional Lexicon of Patients With Breast Cancer: Development and Sentiment Analysis. J Med Internet Res 2023; 25:e44897. [PMID: 37698914 PMCID: PMC10523220 DOI: 10.2196/44897] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2022] [Revised: 08/17/2023] [Accepted: 08/18/2023] [Indexed: 09/13/2023] Open
Abstract
BACKGROUND The innovative method of sentiment analysis based on an emotional lexicon shows prominent advantages in capturing emotional information, such as individual attitudes, experiences, and needs, which provides a new perspective and method for emotion recognition and management for patients with breast cancer (BC). However, at present, sentiment analysis in the field of BC is limited, and there is no emotional lexicon for this field. Therefore, it is necessary to construct an emotional lexicon that conforms to the characteristics of patients with BC so as to provide a new tool for accurate identification and analysis of the patients' emotions and a new method for their personalized emotion management. OBJECTIVE This study aimed to construct an emotional lexicon of patients with BC. METHODS Emotional words were obtained by merging the words in 2 general sentiment lexicons, the Chinese Linguistic Inquiry and Word Count (C-LIWC) and HowNet, and the words in text corpora acquired from patients with BC via Weibo, semistructured interviews, and expressive writing. The lexicon was constructed using manual annotation and classification under the guidance of Russell's valence-arousal space. Ekman's basic emotional categories, Lazarus' cognitive appraisal theory of emotion, and a qualitative text analysis based on the text corpora of patients with BC were combined to determine the fine-grained emotional categories of the lexicon we constructed. Precision, recall, and the F1-score were used to evaluate the lexicon's performance. RESULTS The text corpora collected from patients in different stages of BC included 150 written materials, 17 interviews, and 6689 original posts and comments from Weibo, with a total of 1,923,593 Chinese characters. The emotional lexicon of patients with BC contained 9357 words and covered 8 fine-grained emotional categories: joy, anger, sadness, fear, disgust, surprise, somatic symptoms, and BC terminology. 
Experimental results showed that precision, recall, and the F1-score of positive emotional words were 98.42%, 99.73%, and 99.07%, respectively, and those of negative emotional words were 99.73%, 98.38%, and 99.05%, respectively, which all significantly outperformed the C-LIWC and HowNet. CONCLUSIONS The emotional lexicon with fine-grained emotional categories conforms to the characteristics of patients with BC. Its performance related to identifying and classifying domain-specific emotional words in BC is better compared to the C-LIWC and HowNet. This lexicon not only provides a new tool for sentiment analysis in the field of BC but also provides a new perspective for recognizing the specific emotional state and needs of patients with BC and formulating tailored emotional management plans.
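The reported F1-scores can be checked against the reported precision and recall, because F1 is the harmonic mean of the two. A minimal sketch in Python follows; the function name `f1_score` is illustrative and not taken from the paper, and the inputs are the percentages quoted in the abstract, converted to fractions.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract (percentages as fractions).
positive_f1 = f1_score(0.9842, 0.9973)
negative_f1 = f1_score(0.9973, 0.9838)

print(f"positive F1: {positive_f1:.2%}")  # 99.07%
print(f"negative F1: {negative_f1:.2%}")  # 99.05%
```

Both values agree with the F1-scores stated in the abstract (99.07% and 99.05%).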
|
38
|
Neuromorphic Sentiment Analysis Using Spiking Neural Networks. Sensors (Basel) 2023; 23:7701. [PMID: 37765758 PMCID: PMC10536645 DOI: 10.3390/s23187701] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Revised: 08/25/2023] [Accepted: 09/02/2023] [Indexed: 09/29/2023]
Abstract
Over the past decade, the artificial neural networks domain has seen a considerable embracement of deep neural networks among many applications. However, deep neural networks are typically computationally complex and consume high power, hindering their applicability for resource-constrained applications, such as self-driving vehicles, drones, and robotics. Spiking neural networks, often employed to bridge the gap between machine learning and neuroscience fields, are considered a promising solution for resource-constrained applications. Since deploying spiking neural networks on traditional von Neumann architectures requires significant processing time and high power, typically, neuromorphic hardware is created to execute spiking neural networks. The objective of neuromorphic devices is to mimic the distinctive functionalities of the human brain in terms of energy efficiency, computational power, and robust learning. Furthermore, natural language processing, a machine learning technique, has been widely utilized to aid machines in comprehending human language. However, natural language processing techniques also cannot be deployed efficiently on traditional computing platforms. In this research work, we strive to enhance the natural language processing traits/abilities by harnessing and integrating the SNN traits, as well as deploying the integrated solution on neuromorphic hardware, efficiently and effectively. To facilitate this endeavor, we propose a novel, unique, and efficient sentiment analysis model created using a large-scale SNN model on SpiNNaker neuromorphic hardware that responds to user inputs. SpiNNaker neuromorphic hardware typically can simulate large spiking neural networks in real time and consumes low power. We initially create an artificial neural networks model, and then train the model using an Internet Movie Database (IMDB) dataset. 
Next, the pre-trained artificial neural networks model is converted into our proposed spiking neural networks model, called a spiking sentiment analysis (SSA) model. Our SSA model using SpiNNaker, called SSA-SpiNNaker, is created in such a way to respond to user inputs with a positive or negative response. Our proposed SSA-SpiNNaker model achieves 100% accuracy and only consumes 3970 Joules of energy, while processing around 10,000 words and predicting a positive/negative review. Our experimental results and analysis demonstrate that by leveraging the parallel and distributed capabilities of SpiNNaker, our proposed SSA-SpiNNaker model achieves better performance compared to artificial neural networks models. Our investigation into existing works revealed that no similar models exist in the published literature, demonstrating the uniqueness of our proposed model. Our proposed work would offer a synergy between SNNs and NLP within the neuromorphic computing domain, in order to address many challenges in this domain, including computational complexity and power consumption. Our proposed model would not only enhance the capabilities of sentiment analysis but also contribute to the advancement of brain-inspired computing. Our proposed model could be utilized in other resource-constrained and low-power applications, such as robotics, autonomous, and smart systems.
|
39
|
Off-label drug use during the COVID-19 pandemic in Africa: topic modelling and sentiment analysis of ivermectin in South Africa and Nigeria as a case study. J R Soc Interface 2023; 20:20230200. [PMID: 37700708 PMCID: PMC10498353 DOI: 10.1098/rsif.2023.0200] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Accepted: 07/18/2023] [Indexed: 09/14/2023] Open
Abstract
Although rejected by the World Health Organization, the human and even veterinary formulation of ivermectin has widely been used for prevention and treatment of COVID-19. In this work we leverage Twitter to understand the reasons for the drug use from ivermectin supporters, their source of information, their emotions, their gender demographics, and location information, in Nigeria and South Africa. Topic modelling is performed on a Twitter dataset gathered using keywords 'ivermectin' and 'ivm'. A model is fine-tuned on RoBERTa to find the stance of the tweets. Statistical analysis is performed to compare the stance and emotions. Most ivermectin supporters either redistribute conspiracy theories posted by influencers, or refer to flawed studies confirming ivermectin efficacy in vitro. Three emotions have the highest intensity: optimism, joy and disgust. The number of anti-ivermectin tweets has a significant positive correlation with vaccination rate. All the provinces in South Africa and most of the provinces of Nigeria are pro-ivermectin and have higher disgust polarity. This work makes the effort to understand public discussions regarding ivermectin during the COVID-19 pandemic to help policy-makers understand the rationale behind its popularity, and inform more targeted policies to discourage self-administration of ivermectin. Moreover, it offers a lesson for future outbreaks.
|
40
|
Sentiment Analysis of Tweets on Soda Taxes. J Public Health Manag Pract 2023; 29:633-639. [PMID: 36812042 DOI: 10.1097/phh.0000000000001721] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/24/2023]
Abstract
CONTEXT As a primary source of added sugars, sugar-sweetened beverage (SSB) consumption may contribute to the obesity epidemic. A soda tax is an excise tax charged on selling SSBs to reduce consumption. Currently, 8 cities/counties in the United States have imposed soda taxes. OBJECTIVE This study assessed people's sentiments toward soda taxes in the United States based on social media posts on Twitter. DESIGN We designed a search algorithm to systematically identify and collect soda tax-related tweets posted on Twitter. We built deep neural network models to classify tweets by sentiments. SETTING Computer modeling. PARTICIPANTS Approximately 370 000 soda tax-related tweets posted on Twitter from January 1, 2015, to April 16, 2022. MAIN OUTCOME MEASURE Sentiment associated with a tweet. RESULTS Public attention paid to soda taxes, indicated by the number of tweets posted annually, peaked in 2016, but has declined considerably ever since. The decreasing prevalence of tweets quoting soda tax-related news without revealing sentiments coincided with the rapid increase in tweets expressing a neutral sentiment toward soda taxes. The prevalence of tweets expressing a negative sentiment rose steadily from 2015 to 2019 and then slightly leveled off, whereas that of tweets expressing a positive sentiment remained unchanged. Excluding news-quoting tweets, tweets with neutral, negative, and positive sentiments occupied roughly 56%, 29%, and 15%, respectively, during 2015-2022. The authors' total number of tweets posted, followers, and retweets predicted tweet sentiment. The finalized neural network model achieved an accuracy of 88% and an F1 score of 0.87 in predicting tweet sentiments in the test set. CONCLUSIONS Despite its potential to shape public opinion and catalyze social changes, social media remains an underutilized source of information to inform government decision making. 
Social media sentiment analysis may inform the design, implementation, and modification of soda tax policies to gain social support while minimizing confusion and misinterpretation.
|
41
|
Automated Analysis of Preceptor Comments: A Pilot Study Using Sentiment Analysis to Identify Potential Student Issues in Experiential Education. Am J Pharm Educ 2023; 87:100005. [PMID: 37714650 DOI: 10.1016/j.ajpe.2023.02.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/17/2023]
Abstract
OBJECTIVE The purpose of this paper is to describe a sentiment analysis program that aids in identifying pharmacy students at risk for progression issues by automatically scoring preceptor comments as positive or negative. METHODS An R-based program to analyze advanced pharmacy practice experiences and introductory pharmacy practice experiences midpoint evaluation of preceptor comments was piloted in phase 1 by comparing the sentiment analysis algorithm results to human coding. The algorithm was refined in phase 2. In phase 3, the validation phase, the final sentiment analysis algorithm analyzed all midpoint student evaluations (n = 1560). Sentiment scores were generated for each preceptor comment, and correlations were performed between sentiment scores and the quantitative scoring provided on the assessment. RESULTS In phase 1, agreement between faculty coders and sentiment analysis was 96%, and in phase 2, agreement between the final codes and sentiment analysis was 92.4% once keywords were added to the sentiment dictionary. In phase 3, a total of 3919 comments from 1560 evaluations were analyzed, and overall, the sentiment analysis results aligned with the quantitative data. CONCLUSION This sentiment analysis algorithm was accurate in capturing positive and negative comments corresponding to pharmacy student performance. Given the accuracy of this preliminary validation for flagging preceptor comments, there are numerous implications when considering the use of sentiment analysis in pharmacy education. Using a sentiment analysis program minimizes the number of qualitative preceptor comments needing review by experiential faculty, as this program can aid in identifying students at risk of progression issues.
|
42
|
Block-level dependency syntax based model for end-to-end aspect-based sentiment analysis. Neural Netw 2023; 166:225-235. [PMID: 37515902 DOI: 10.1016/j.neunet.2023.05.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2022] [Revised: 03/27/2023] [Accepted: 05/03/2023] [Indexed: 07/31/2023]
Abstract
End-to-End aspect-based sentiment analysis (E2E-ABSA) aims to jointly extract aspect terms and identify their sentiment polarities. Although previous research has demonstrated that syntax knowledge can be beneficial for E2E-ABSA, standard syntax dependency parsing struggles to capture the block-level relation between aspect and opinion terms, which hinders the role of syntax in E2E-ABSA. To address this issue, this paper proposes a block-level dependency syntax parsing (BDEP) based model to enhance the performance of E2E-ABSA. BDEP is constructed by incorporating routine dependency syntax parsing and part-of-speech tagging, which enables the capture of block-level relations. Subsequently, the BDEP-guided interactive attention module (BDEP-IAM) is used to obtain the aspect-aware representation of each word. Finally, the adaptive fusion module is leveraged to combine the semantic-syntactic representation to simultaneously extract the aspect term and identify aspect-oriented sentiment polarity. The model is evaluated on five benchmark datasets, including Laptop14, Rest_ALL, Restaurant14, Restaurant15, and TWITTER, with F1 scores of 62.67%, 76.53%, 75.42%, 62.21%, and 58.03%, respectively. The results show that our model outperforms the other compared state-of-the-art (SOTA) methods on all datasets. Additionally, ablation experiments confirm the efficacy of BDEP and IAM in improving aspect-level sentiment analysis.
|
43
|
Seeking and Providing Social Support on Twitter for Trauma and Distress During the COVID-19 Pandemic: Content and Sentiment Analysis. J Med Internet Res 2023; 25:e46343. [PMID: 37651178 PMCID: PMC10502591 DOI: 10.2196/46343] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/07/2023] [Revised: 06/22/2023] [Accepted: 07/18/2023] [Indexed: 09/01/2023] Open
Abstract
BACKGROUND The COVID-19 pandemic can be recognized as a traumatic event that led to stressors, resulting in trauma or distress among the general population. Social support is vital in the management of these stressors, especially during a traumatic event, such as the COVID-19 pandemic. Because of the limited face-to-face interactions enforced by physical distancing regulations during the pandemic, people sought solace on social media platforms to connect with, and receive support from, one another. Hence, it is crucial to investigate the ways in which people seek and offer support on social media for mental health management. OBJECTIVE The research aimed to examine the types of social support (eg, emotional, informational, instrumental, and appraisal) sought and provided for trauma or distress on Twitter during the COVID-19 pandemic. In addition, this study aimed to gain insight into the difficulties and concerns of people during the pandemic by identifying the associations between terms representing the topics of interest related to trauma or distress and their corresponding sentiments. METHODS The study methods included content analysis to investigate the type of social support people sought for trauma or distress during the pandemic. Sentiment analysis was also performed to track the negative and positive sentiment tweets posted between January 1, 2020, and March 15, 2021. Association rule mining was used to uncover associations between terms and sentiments in tweets. In addition, the research used Kruskal-Wallis and Mann-Whitney U tests to determine whether the retweet count and like count varied based on the social support type. RESULTS Most Twitter users who indicated trauma or distress sought emotional support. Regarding sentiment, Twitter users mostly posted negative sentiment tweets, particularly in January 2021. An intriguing observation was that wearing masks could trigger and exacerbate trauma or distress. 
The results revealed that people mostly sought and provided emotional support on Twitter regarding difficulties with wearing masks, mental health status, financial hardships, and treatment methods for trauma or distress. In addition, tweets regarding emotional support received the most endorsements from other users, highlighting the critical role of social support in fostering a sense of community and reducing the feelings of isolation during the pandemic. CONCLUSIONS This study demonstrates the potential of social media as a platform to exchange social support during challenging times and to identify the specific concerns (eg, wearing masks and exacerbated symptoms) of individuals with self-reported trauma or distress. The findings provide insights into the types of support that were most beneficial for those struggling with trauma or distress during the pandemic and may inform policy makers and health organizations regarding better practices for pandemic response and special considerations for groups with a history of trauma or distress.
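The association rule mining step mentioned in the methods can be illustrated with the two standard rule measures, support and confidence. The sketch below is a minimal illustration under stated assumptions: the function names and the four toy "tweets" are invented for this example, and they are neither data nor code from the study.

```python
from typing import FrozenSet, List


def support(transactions: List[FrozenSet[str]], items: FrozenSet[str]) -> float:
    """Fraction of transactions that contain every item in `items`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(
    transactions: List[FrozenSet[str]],
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
) -> float:
    """Estimate of P(consequent | antecedent): support(A and C) / support(A)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, antecedent | consequent) / base


# Toy, made-up transactions: terms from a tweet plus a sentiment tag.
tweets = [
    frozenset({"mask", "anxiety", "NEG"}),
    frozenset({"mask", "panic", "NEG"}),
    frozenset({"mask", "support", "POS"}),
    frozenset({"therapy", "support", "POS"}),
]

# Rule under test: {"mask"} -> {"NEG"}
rule_support = support(tweets, frozenset({"mask", "NEG"}))
rule_confidence = confidence(tweets, frozenset({"mask"}), frozenset({"NEG"}))
print(rule_support, rule_confidence)
```

In this toy data the rule holds in 2 of 4 tweets (support 0.5) and in 2 of the 3 tweets that mention "mask" (confidence about 0.67). A full miner such as Apriori would enumerate candidate rules and keep those above chosen thresholds.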
|
44
|
SentiUrdu-1M: A large-scale tweet dataset for Urdu text sentiment analysis using weakly supervised learning. PLoS One 2023; 18:e0290779. [PMID: 37647318 PMCID: PMC10468080 DOI: 10.1371/journal.pone.0290779] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2022] [Accepted: 08/15/2023] [Indexed: 09/01/2023] Open
Abstract
Low-resource languages are gaining much-needed attention with the advent of deep learning models and pre-trained word embeddings. Though spoken by more than 230 million people worldwide, Urdu is one such low-resource language that has recently gained popularity online and is attracting a lot of attention and support from the research community. One challenge faced by such resource-constrained languages is the scarcity of publicly available large-scale datasets for conducting any meaningful study. In this paper, we address this challenge by collecting the first-ever large-scale Urdu Tweet Dataset for sentiment analysis and emotion recognition. The dataset consists of a staggering number of 1,140,821 tweets in the Urdu language. Obviously, manual labeling of such a large number of tweets would have been tedious, error-prone, and humanly impossible; therefore, the paper also proposes a weakly supervised approach to label tweets automatically. Emoticons used within the tweets, in addition to SentiWordNet, are utilized to propose a weakly supervised labeling approach to categorize extracted tweets into positive, negative, and neutral categories. Baseline deep learning models are implemented to compute the accuracy of three labeling approaches, i.e., VADER, TextBlob, and our proposed weakly supervised approach. Unlike the weakly supervised labeling approach, the VADER and TextBlob put most tweets as neutral and show a high correlation between the two. This is largely attributed to the fact that these models do not consider emoticons for assigning polarity.
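The general idea of labeling tweets from emoticons plus a polarity lexicon can be sketched as follows. This is a simplified illustration and not the authors' procedure: the emoji sets, the placeholder lexicon entries (`good_word`, `bad_word`), the scoring rule, and the function name `weak_label` are all assumptions made for this example.

```python
from typing import Dict

# Hypothetical emoji sets, chosen only for illustration.
POSITIVE_EMOJI = {"😀", "😊", "👍"}
NEGATIVE_EMOJI = {"😢", "😡", "👎"}


def weak_label(text: str, lexicon: Dict[str, float]) -> str:
    """Assign a weak sentiment label from emoji counts plus lexicon polarity."""
    score = 0.0
    # Each positive emoji adds one point, each negative emoji removes one.
    for emoji in POSITIVE_EMOJI:
        score += text.count(emoji)
    for emoji in NEGATIVE_EMOJI:
        score -= text.count(emoji)
    # Whitespace tokens found in the lexicon contribute their polarity score.
    for token in text.split():
        score += lexicon.get(token, 0.0)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Placeholder entries standing in for a SentiWordNet-style polarity lexicon.
toy_lexicon = {"good_word": 0.6, "bad_word": -0.7}

print(weak_label("good_word 😊", toy_lexicon))
print(weak_label("😡 bad_word", toy_lexicon))
print(weak_label("plain text", toy_lexicon))
```

In this sketch, a tweet with no lexicon hits and no emoji falls back to neutral, so removing the emoji step would push emoji-only tweets into the neutral class.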
|
45
|
Depression in South Korean Adolescents Captured by Text and Opinion Mining of Social Big Data. Int J Environ Res Public Health 2023; 20:6665. [PMID: 37681805 PMCID: PMC10487740 DOI: 10.3390/ijerph20176665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Revised: 08/03/2023] [Accepted: 08/24/2023] [Indexed: 09/09/2023]
Abstract
Depression in adolescence is recognized as an important social and public health issue that interferes with continued physical growth and increases the likelihood of other mental disorders. The goal of this study was to examine online documents posted by South Korean adolescents over 3 years, using text and opinion mining of collectable documents to capture signs of depression. The sample consisted of individual online text documents by adolescents that contained depression-related words; these were collected from 215 social media websites in South Korea from 1 January 2012 to 31 December 2014. A sentiment lexicon was developed for adolescent depressive symptoms, and such sentiments were analyzed through opinion mining. The depressive symptoms in the present study were classified into nine categories as suggested by the Diagnostic and Statistical Manual for Mental Disorders, 5th Edition (DSM-5). The association analysis and decision tree analysis of data mining were used to build an efficient prediction model of adolescent depression. Opinion mining indicated that 15.5% were emotionally stable, 58.6% moderately stressed, and 25.9% highly distressed. Data mining revealed that the presence of depressed mood most of the day or nearly every day had the greatest effect on adolescents' depression. Social big data analysis may serve as a viable option for developing a timely response system for emotionally susceptible adolescents. The present study represents one of the first attempts to investigate depression in South Korean adolescents using text and opinion mining from three years of online documents that originally amounted to approximately 3.1 billion documents.
|
46
|
Users' Concerns About Endometriosis on Social Media: Sentiment Analysis and Topic Modeling Study. J Med Internet Res 2023; 25:e45381. [PMID: 37581905 PMCID: PMC10466158 DOI: 10.2196/45381] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2022] [Revised: 06/14/2023] [Accepted: 07/04/2023] [Indexed: 08/16/2023] Open
Abstract
BACKGROUND: Endometriosis is a debilitating and difficult-to-diagnose gynecological disease. Owing to limited information and awareness, women often rely on social media platforms as a support system to engage in discussions regarding their disease-related concerns. OBJECTIVE: This study aimed to apply computational techniques to social media posts to identify discussion topics about endometriosis and to identify themes that require more attention from health care professionals and researchers. We also aimed to explore whether, amid the challenging nature of the disease, there are themes within the endometriosis community that gather posts with positive sentiments. METHODS: We retrospectively extracted posts from the subreddits r/Endo and r/endometriosis from January 2011 to April 2022. We analyzed 45,693 Reddit posts using sentiment analysis and topic modeling-based methods in machine learning. RESULTS: Since 2011, the number of posts and comments has increased steadily. The posts were categorized into 11 categories, and the highest number of posts were related to either asking for information (Question); sharing the experiences (Rant/Vent); or diagnosing and treating endometriosis, especially surgery (Surgery related). Sentiment analysis revealed that 92.09% (42,077/45,693) of posts were associated with negative sentiments, only 2.3% (1,053/45,693) expressed positive feelings, and there were no categories with more positive than negative posts. Topic modeling revealed 27 major topics, and the most popular topics were Surgery, Questions/Advice, Diagnosis, and Pain. The Survey/Research topic, which brought together most research-related posts, was the last in terms of posts. CONCLUSIONS: Our study shows that posts on social media platforms can provide insights into the concerns of women with endometriosis symptoms. The analysis of the posts confirmed that women with endometriosis have to face negative emotions and pain daily. The large number of posts related to asking questions shows that women do not receive sufficient information from physicians and need community support to cope with the disease. Health care professionals should pay more attention to the symptoms and diagnosis of endometriosis, discuss these topics with patients to reduce their dissatisfaction with doctors, and contribute more to the overall well-being of women with endometriosis. Researchers should also become more involved in social media and share new science-based knowledge regarding endometriosis.
|
47
|
Will the Relaxation of COVID-19 Control Measures Have an Impact on the Chinese Internet-Using Public? Social Media-Based Topic and Sentiment Analysis. Int J Public Health 2023; 68:1606074. [PMID: 37637486 PMCID: PMC10448249 DOI: 10.3389/ijph.2023.1606074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2023] [Accepted: 07/24/2023] [Indexed: 08/29/2023] Open
Abstract
Objective: In December 2022, the Chinese government announced the further optimization of the implementation of the prevention and control measures of COVID-19. We aimed to assess internet-using public expression and sentiment toward COVID-19 in the relaxation of control measures in China. Methods: We used a user-simulation-like web crawler to collect raw data from Sina-Weibo and then processed the raw data, including the removal of punctuation, stop words, and text segmentation. After performing the above processes, we analyzed the data in two aspects. Firstly, we used the Latent Dirichlet Allocation (LDA) model to analyze the text data and extract the theme. After that, we used sentiment analysis to reveal the sentiment trend and the geographical spatial sentiment distribution. Results: A total of five topics were extracted according to the LDA model, namely, Complete liberalization, Resource supply, Symptom, Knowledge, and Emotional Outlet. Furthermore, sentiment analysis indicates that while the percentages of positive and negative microblogs fluctuate over time, the overall quantity of positive microblogs exceeds that of negative ones. Meanwhile, the geographical dispersion of public sentiment on internet usage exhibits significant regional variations and is subject to multifarious factors such as economic conditions and demographic characteristics. Conclusion: In the face of the relaxation of COVID-19 control measures, although concerns arise among people, they continue to encourage and support each other.
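The preprocessing steps named in the Methods (punctuation removal, stop-word removal, text segmentation) can be sketched as follows. This is a simplified illustration under stated assumptions: it uses whitespace splitting on English text as a stand-in for Chinese word segmentation, and the stop-word set and the function name `preprocess` are hypothetical rather than the authors' pipeline.

```python
import string

# Hypothetical toy stop-word list; a real study would use a curated list.
TOY_STOP_WORDS = {"the", "a", "of", "and", "is", "to"}
# Translation table that deletes ASCII punctuation characters.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(text: str) -> list[str]:
    """Remove punctuation, segment on whitespace, and drop stop words."""
    no_punct = text.translate(_PUNCT_TABLE)
    tokens = no_punct.lower().split()  # stand-in for real word segmentation
    return [tok for tok in tokens if tok not in TOY_STOP_WORDS]
```

The resulting token lists are the kind of input a topic model such as LDA would consume after being converted to a bag-of-words representation.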
|
48
|
An Attention-Aware Long Short-Term Memory-Like Spiking Neural Model for Sentiment Analysis. Int J Neural Syst 2023; 33:2350037. [PMID: 37303084 DOI: 10.1142/s0129065723500375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
The LSTM-SNP model is a recently developed long short-term memory (LSTM) network inspired by the mechanisms of spiking neural P (SNP) systems. In this paper, LSTM-SNP is used to propose a novel model for aspect-level sentiment analysis, termed the ALS model. The LSTM-SNP model has three gates: a reset gate, a consumption gate, and a generation gate. Moreover, an attention mechanism is integrated with the LSTM-SNP model. The ALS model can better capture the sentiment features in the text to compute the correlation between context and aspect words. To validate the effectiveness of the ALS model for aspect-level sentiment analysis, comparison experiments with 17 baseline models are conducted on three real-life data sets. The experimental results demonstrate that the ALS model has a simpler structure and can achieve better performance compared to these baseline models.
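To make the idea of correlating context words with an aspect word concrete, here is a minimal sketch of generic dot-product attention. It is not the ALS model: the LSTM-SNP gates are not reproduced, and the function names and the choice of dot-product scoring are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(context_vectors, aspect_vector):
    """Weight each context vector by its dot-product match with the aspect.

    Returns the attention weights and the weighted sum of context vectors.
    """
    scores = [
        sum(c * a for c, a in zip(vec, aspect_vector))
        for vec in context_vectors
    ]
    weights = softmax(scores)
    dim = len(aspect_vector)
    summary = [
        sum(w * vec[i] for w, vec in zip(weights, context_vectors))
        for i in range(dim)
    ]
    return weights, summary
```

Context words whose vectors align more closely with the aspect vector receive larger weights, so they dominate the summary vector passed on to a sentiment classifier.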
|
49
|
Volatility of the COVID-19 vaccine hesitancy: sentiment analysis conducted in Brazil. Front Public Health 2023; 11:1192155. [PMID: 37483947 PMCID: PMC10360403 DOI: 10.3389/fpubh.2023.1192155] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/23/2023] [Accepted: 06/16/2023] [Indexed: 07/25/2023] Open
Abstract
Background: Vaccine hesitancy is a phenomenon that can interfere with the expansion of vaccination coverage and is positioned as one of the top 10 global health threats. Previous studies have explored factors that affect vaccine hesitancy, how it behaves in different locations, and the profile of individuals in which it is most present. However, few studies have analyzed the volatility of vaccine hesitancy. Objective: To identify the volatility of vaccine hesitancy manifested in social media. Methods: Twitter's academic application programming interface was used to retrieve all tweets in Brazilian Portuguese mentioning the COVID-19 vaccine in 3 months (October 2020, June 2021, and October 2021), retrieving 1,048,576 tweets. A sentiment analysis was performed using the Orange software with the lexicon Multilingual sentiment in Portuguese. Results: The feelings associated with vaccine hesitancy were volatile within 1 month, as well as throughout the vaccination process, being positioned as a resilient phenomenon. The themes that nurture vaccine hesitancy change dynamically and swiftly and are often associated with other topics that are also affecting society. Conclusion: People who express vaccine hesitancy present arguments that vary over a short period of time, which demands that government strategies to mitigate the effects of vaccine hesitancy be agile and counteract the expressed fear by presenting scientific arguments.
|
50
|
Lexicon-based sentiment analysis to detect opinions and attitude towards COVID-19 vaccines on Twitter in Italy. Comput Biol Med 2023; 158:106876. [PMID: 37030266 PMCID: PMC10072979 DOI: 10.1016/j.compbiomed.2023.106876] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2022] [Revised: 02/26/2023] [Accepted: 03/30/2023] [Indexed: 04/08/2023]
Abstract
The paper proposes a methodology based on Natural Language Processing (NLP) and Sentiment Analysis (SA) to get insights into sentiments and opinions toward COVID-19 vaccination in Italy. The studied dataset consists of vaccine-related tweets published in Italy from January 2021 to February 2022. In the considered period, 353,217 tweets were analyzed, obtained after filtering 1,602,940 tweets containing the word "vaccin". A main novelty of the approach is the categorization of opinion holders into four classes (Common users, Media, Medicine, Politics), obtained by applying NLP tools, enhanced with large-scale domain-specific lexicons, to the short bios published by the users themselves. Feature-based sentiment analysis is enriched with an Italian sentiment lexicon containing polarized words, which express semantic orientation, and intensive words, which give cues to identify the tone of voice of each user category. The results of the analysis highlighted an overall negative sentiment throughout the considered period, especially for the Common users, and a different attitude of opinion holders towards specific important events, such as deaths after vaccination, occurring on particular days within the examined 14 months.
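The combination of polarized words and intensive words mentioned in this abstract can be sketched with a small scoring function. The English toy lexicons, the multiplier values, and the name `lexicon_score` are assumptions for illustration only; they are not the Italian lexicon or the scoring rule used in the paper.

```python
# Hypothetical toy lexicons, for illustration only.
TOY_POLARITY = {"good": 1.0, "safe": 1.0, "bad": -1.0, "dangerous": -1.0}
TOY_INTENSIFIERS = {"very": 1.5, "extremely": 2.0}


def lexicon_score(text: str) -> float:
    """Sum word polarities, scaling a word by a directly preceding intensifier."""
    score = 0.0
    boost = 1.0
    for token in text.lower().split():
        if token in TOY_INTENSIFIERS:
            # Remember the multiplier and apply it to the next token.
            boost = TOY_INTENSIFIERS[token]
            continue
        score += boost * TOY_POLARITY.get(token, 0.0)
        boost = 1.0  # an intensifier only affects the word right after it
    return score
```

A negative total suggests negative sentiment and a positive total suggests positive sentiment; negation handling and punctuation stripping are deliberately left out of this sketch.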
|