Weighted Joint Sentiment-Topic Model for Sentiment Analysis Compared to ALGA: Adaptive Lexicon Learning Using Genetic Algorithm.
COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:7612276. [PMID: 35965748 PMCID: PMC9374039 DOI: 10.1155/2022/7612276]
[Received: 02/23/2022] [Revised: 04/14/2022] [Accepted: 05/08/2022] [Indexed: 11/24/2022]
Abstract
Latent Dirichlet Allocation (LDA) is an unsupervised learning approach that aims to
investigate the semantic relations among words in a document as well as the influence of a topic
on a word. As an LDA-based model, Joint Sentiment-Topic (JST) examines the impact of
topics and emotions on words. The emotion parameter alone is insufficient, however, and additional
parameters may play valuable roles in achieving better performance. In this study, two new
topic models, Weighted Joint Sentiment-Topic (WJST) and Weighted Joint Sentiment-Topic 1
(WJST1), are presented to extend and improve JST through two new parameters that can
generate a sentiment dictionary. In the proposed methods, each word in a document affects
its neighbors, and different words in the document may be affected simultaneously by
several neighboring words. The proposed models therefore consider the effect of words on each
other, which, in our view, is an important factor that can increase the performance of
baseline methods. According to the evaluation results, the new parameters have a large effect
on model accuracy. Although they do not require labeled data, the proposed methods are more
accurate in the evaluation than discriminative models such as SVM and logistic
regression. The proposed methods are simple, with a small number of parameters. While
providing a broad view of the connections between different words in documents of a
single collection (single-domain) or multiple collections (multidomain), the proposed
methods offer a solution for each of these two
situations. WJST is suitable for multidomain datasets, and WJST1 is a version of WJST
that is suitable for single-domain datasets. In addition to detecting emotion at the
document level, the proposed models improve on the evaluation results of the baseline
approaches. Thirteen datasets of different sizes were used in the experiments. In
this study, perplexity, document-level opinion mining, and
topic coherence are used for assessment. In addition, the Friedman
test is used to check whether the results of the proposed models are statistically
different from those of other algorithms. The results show that the accuracy
of the proposed methods is above 80% for most of the datasets. WJST1 achieves the highest
accuracy on the Movie dataset, at 97%, and WJST achieves the highest accuracy on
the Electronic dataset, at 86%. The proposed models obtain better results than
Adaptive Lexicon learning using Genetic Algorithm (ALGA), which employs an evolutionary
approach to build an emotion dictionary. The results also show that the proposed methods perform
better under different topic number settings, especially WJST1, which reaches 97% accuracy
at |Z| = 5 on the Movie dataset.
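
The Friedman test mentioned in the abstract ranks the competing models on each dataset and checks whether their rank sums differ more than chance would allow. The following is a minimal sketch of that statistic, not the authors' evaluation code. The function names, the model labels, and the accuracy values are all hypothetical placeholders for illustration and are not results from the paper. The sketch omits the tie correction that statistical libraries usually apply.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def friedman_statistic(scores_by_model):
    """Return (chi-square statistic, degrees of freedom), without tie correction."""
    models = list(scores_by_model)
    k = len(models)                          # number of models compared
    n = len(scores_by_model[models[0]])      # number of datasets
    rank_sums = [0.0] * k
    for d in range(n):
        row = [scores_by_model[m][d] for m in models]
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    stat = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    return stat, k - 1


# Hypothetical accuracies (NOT taken from the paper): one list per model,
# one entry per dataset, in the same dataset order for every model.
example_scores = {
    "model_a": [0.81, 0.77, 0.90, 0.68, 0.85],
    "model_b": [0.78, 0.74, 0.88, 0.65, 0.80],
    "model_c": [0.70, 0.69, 0.79, 0.60, 0.72],
}
stat, dof = friedman_statistic(example_scores)
```

The statistic is compared against a chi-square distribution with k - 1 degrees of freedom; for three models (two degrees of freedom) the critical value at the 0.05 level is 5.991, so a larger statistic suggests that the models' results differ. SciPy's `scipy.stats.friedmanchisquare` computes the same test and also returns a p-value.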