1
Biswas MR, Shah Z. Extracting factors associated with vaccination from Twitter data and mapping to behavioral models. Hum Vaccin Immunother 2023; 19:2281729. [PMID: 38013461 PMCID: PMC10760324 DOI: 10.1080/21645515.2023.2281729] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2023] [Accepted: 11/05/2023] [Indexed: 11/29/2023] Open
Abstract
Social media platforms, particularly Twitter, are rich data sources that allow monitoring of public opinions and attitudes toward vaccines. Established behavioral models like the 5C psychological antecedents model and the Health Belief Model (HBM) provide a well-structured framework for analyzing shifts in vaccine-related behavior. This study examines whether data extracted from Twitter contains valuable insights regarding public attitudes toward vaccines and can be mapped to these two behavioral models. This study focuses on the Arab population, and a search was carried out on Twitter using: 'تلقيحي OR تطعيم OR تطعيمات OR لقاح OR لقاحات' for two years from January 2020 to January 2022. Then, BERTopic modeling was applied, and several topics were extracted. Finally, the topics were manually mapped to the factors of the 5C model and HBM. In total, 1,068,466 unique users posted 3,368,258 vaccine-related tweets in Arabic. Topic modeling generated 25 topics, which were mapped to the 15 factors of the 5C model and HBM. Among the users, 32.87% were male, and 18.06% were female. A majority (55.77%) of the users were from the MENA (Middle East and North Africa) region. Twitter users were more inclined to accept vaccines when they trusted vaccine safety and effectiveness, but vaccine hesitancy increased due to conspiracy theories and misinformation. The association of topics with these theoretical frameworks reveals the availability and diversity of Twitter data that can predict behavioral change toward vaccines. This allows timelier and more effective interventions for vaccination programs than traditional methods permit.
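The OR-style keyword search described in this abstract can be illustrated with a minimal sketch. The five Arabic terms are copied from the abstract; the substring-matching logic, the function names, and the sample texts are assumptions made for illustration and do not reproduce how Twitter's search or the authors' pipeline actually matched tweets.

```python
# Search terms copied from the abstract's query; everything else is illustrative.
VACCINE_TERMS = ["تلقيحي", "تطعيم", "تطعيمات", "لقاح", "لقاحات"]


def matches_query(tweet_text, terms=VACCINE_TERMS):
    """Return True if any search term occurs in the tweet (logical OR)."""
    return any(term in tweet_text for term in terms)


def filter_tweets(tweets):
    """Keep only the tweets that contain at least one vaccine-related term."""
    return [t for t in tweets if matches_query(t)]


# Hypothetical sample texts, not taken from the study's data.
sample = ["تم أخذ لقاح كورونا اليوم", "الطقس جميل اليوم"]
print(filter_tweets(sample))
```

Plain substring matching is the simplest choice here; a real collection step would more likely rely on the platform's own query semantics and tokenization.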
Affiliation(s)
- Md. Rafiul Biswas
  - Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, Doha, Qatar
- Zubair Shah
  - Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, Doha, Qatar
2
Omar A, Abd El-Hafeez T. Quantum computing and machine learning for Arabic language sentiment classification in social media. Sci Rep 2023; 13:17305. [PMID: 37828056 PMCID: PMC10570340 DOI: 10.1038/s41598-023-44113-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2023] [Accepted: 10/03/2023] [Indexed: 10/14/2023] Open
Abstract
With the increasing amount of digital data generated by Arabic speakers, the need for effective and efficient document classification techniques is more important than ever. In recent years, both quantum computing and machine learning have shown great promise in the field of document classification. However, there is a lack of research investigating the performance of these techniques on the Arabic language. This paper presents a comparative study of quantum computing and machine learning for two datasets of Arabic language document classification. In the first dataset of 213,465 Arabic tweets, both classic machine learning (ML) and quantum computing approaches achieve high accuracy in sentiment analysis, with quantum computing slightly outperforming classic ML. Quantum computing completes the task in approximately 59 min, slightly faster than classic ML, which takes around 1 h. The precision, recall, and F1 score metrics indicate the effectiveness of both approaches in predicting sentiment in Arabic tweets. Classic ML achieves precision, recall, and F1 score values of 0.8215, 0.8175, and 0.8121, respectively, while quantum computing achieves values of 0.8239, 0.8199, and 0.8147, respectively. In the second dataset of 44,000 tweets, both classic ML (using the Random Forest algorithm) and quantum computing demonstrate significantly reduced processing times compared to the first dataset, with no substantial difference between them. Classic ML completes the analysis in approximately 2 min, while quantum computing takes approximately 1 min and 53 s. The accuracy of classic ML is higher at 0.9241 compared to 0.9205 for quantum computing. However, both approaches achieve high precision, recall, and F1 scores, indicating their effectiveness in accurately predicting sentiment in the dataset. Classic ML achieves precision, recall, and F1 score values of 0.9286, 0.9241, and 0.9249, respectively, while quantum computing achieves values of 0.92456, 0.9205, and 0.9214, respectively. The analysis of the metrics indicates that quantum computing approaches are effective in identifying positive instances and capturing relevant sentiment information in large datasets. On the other hand, traditional machine learning techniques exhibit faster processing times when dealing with smaller dataset sizes. This study provides valuable insights into the strengths and limitations of quantum computing and machine learning for Arabic document classification, emphasizing the potential of quantum computing in achieving high accuracy, particularly in scenarios where traditional machine learning techniques may encounter difficulties. These findings contribute to the development of more accurate and efficient document classification systems for Arabic data.
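For readers comparing the precision, recall, and F1 figures reported here, the following is a minimal sketch of how support-weighted versions of these metrics can be computed from predicted labels. The abstract does not state which averaging scheme was used; weighted averaging is an assumption, chosen because under it the averaged recall equals accuracy, which would be consistent with the matching accuracy and recall values reported for the second dataset. The labels below are hypothetical.

```python
from collections import Counter


def weighted_prf(y_true, y_pred):
    """Support-weighted precision, recall, and F1 over all classes."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)
    total = len(y_true)
    p_sum = r_sum = f_sum = 0.0
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        weight = support[label] / total  # share of true instances of this class
        p_sum += weight * precision
        r_sum += weight * recall
        f_sum += weight * f1
    return p_sum, r_sum, f_sum


# Hypothetical toy labels, not data from the study.
y_true = ["pos", "pos", "pos", "pos", "neg", "neg"]
y_pred = ["pos", "pos", "pos", "neg", "neg", "pos"]
precision, recall, f1 = weighted_prf(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

With this scheme the weighted F1 is an average of per-class F1 scores, so it need not equal the harmonic mean of the averaged precision and recall.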
Affiliation(s)
- Ahmed Omar
  - Department of Computer Science, Faculty of Science, Minia University, EL-Minia, Egypt.
- Tarek Abd El-Hafeez
  - Department of Computer Science, Faculty of Science, Minia University, EL-Minia, Egypt.
  - Computer Science Unit, Deraya University, EL-Minia, Egypt.
3
Modi A, Shah K, Shah S, Patel S, Shah M. Sentiment Analysis of Twitter Feeds Using Flask Environment: A Superior Application of Data Analysis. ANNALS OF DATA SCIENCE 2022; 11:1-22. [PMID: 38625244 PMCID: PMC9554374 DOI: 10.1007/s40745-022-00445-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/07/2021] [Revised: 07/25/2022] [Accepted: 08/31/2022] [Indexed: 11/07/2022]
Abstract
Social media plays a vital role in today's world because it sits at the pinnacle of data sharing. Advances in technology have made a huge amount of information available for data analysis, which is currently a topic of intense interest. People express and share their opinions across various social media platforms like Twitter, Facebook, and Instagram. Twitter is a prodigious platform containing an ample amount of data, and analyzing that data is a top priority. One of the most widely used approaches for classifying an individual's emotions displayed in subjective data is sentiment analysis. Sentiment analysis is done using various machine learning algorithms such as Support Vector Machine, Naive Bayes, Long Short-Term Memory, Decision Tree Classifier, and many more, but this paper aims at a generalized way of performing Twitter sentiment analysis using the Flask environment. The Flask environment provides various inbuilt functionalities to classify the sentiment of text into three categories: positive, negative, and neutral. It also makes API calls to the Twitter Developer account to fetch the Twitter data. After the data is fetched and analyzed, the results are displayed on a webpage that shows the percentage of positive, negative, and neutral tweets for a phrase in a pie chart. It also displays the language analysis for the same phrase. Furthermore, the webpage highlights the tweets posted about that phrase and reveals their details. Considering the major industry leaders of three different sectors, namely Enterprises, the Sports Apparel Industry, and the Multimedia Industry, we have analyzed and compared sentiments toward two different multinational companies from each sector.
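The pie-chart percentages mentioned in this abstract amount to simple proportions over per-tweet labels. A minimal sketch follows, assuming each tweet has already been labeled positive, negative, or neutral by some upstream classifier; the function name and the sample labels are hypothetical, and the Flask and Twitter API layers are deliberately left out.

```python
from collections import Counter

CATEGORIES = ("positive", "negative", "neutral")


def sentiment_shares(labels):
    """Return the percentage of each sentiment category, rounded to one decimal."""
    total = len(labels)
    if total == 0:
        return {category: 0.0 for category in CATEGORIES}
    counts = Counter(labels)
    return {c: round(100 * counts[c] / total, 1) for c in CATEGORIES}


# Hypothetical labels for ten tweets about one search phrase.
labels = ["positive"] * 5 + ["negative"] * 3 + ["neutral"] * 2
shares = sentiment_shares(labels)
print(shares)
```

The resulting dictionary is the kind of payload a web view could pass to a charting component.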
Affiliation(s)
- Astha Modi
  - Department of Information and Communication Technology, School of Technology, Pandit Deendayal Energy University, Gandhinagar, Gujarat 382426 India
- Khelan Shah
  - Department of Information and Communication Technology, School of Technology, Pandit Deendayal Energy University, Gandhinagar, Gujarat 382426 India
- Shrey Shah
  - Department of Information and Communication Technology, School of Technology, Pandit Deendayal Energy University, Gandhinagar, Gujarat 382426 India
- Samir Patel
  - Department of Computer Science, School of Technology, Pandit Deendayal Energy University, Gandhinagar, Gujarat 382426 India
- Manan Shah
  - Department of Chemical Engineering, School of Energy Technology, Pandit Deendayal Energy University, Gandhinagar, Gujarat 382426 India
4
Albahli S. Twitter sentiment analysis: An Arabic text mining approach based on COVID-19. Front Public Health 2022; 10:966779. [PMID: 36299761 PMCID: PMC9589219 DOI: 10.3389/fpubh.2022.966779] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/11/2022] [Accepted: 07/29/2022] [Indexed: 01/24/2023] Open
Abstract
The 21st century has seen many innovations, including the advancement of social media platforms. These platforms changed how people interact and how news is transmitted: people can now voice their opinions, whereas previously only reporters were heard. Social media has become the most influential outlet for free speech and emotional expression. Anyone can express emotions using social media platforms like Facebook, Twitter, Instagram, and YouTube. The raw data is increasing daily for every culture and field of life, so there is a need to process this raw data to get meaningful information. If a nation or country wants to know its people's needs, it needs mined data that shows the actual meaning of the people's emotions. The COVID-19 pandemic came with many problems going beyond the virus itself, as there was mass hysteria and the spread of wrong information on social media. This problem put the whole world into turmoil, and research was done to find a way to mitigate the spread of incorrect news. In this research study, we propose a model for detecting genuine news related to the COVID-19 pandemic in Arabic text using sentiment-based data from Twitter for Gulf countries. The proposed sentiment analysis model uses Machine Learning and SMOTE for imbalanced dataset handling. The results showed that people in Gulf countries had a negative sentiment during the COVID-19 pandemic. This work was done so government authorities can easily learn directly from people all across the world about the spread of COVID-19 and take appropriate actions in efforts to control it.
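SMOTE, which this abstract mentions for handling the imbalanced dataset, creates synthetic minority-class samples by interpolating between a minority point and one of its nearest minority neighbours. The sketch below illustrates only that core idea in plain Python; the function name, the parameters `k` and `seed`, and the toy points are assumptions, and the abstract gives no detail on how SMOTE was configured in the study.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between minority neighbours.

    Assumes `minority` holds at least two numeric tuples of equal length.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of `base` by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


# Hypothetical two-dimensional feature vectors for the minority class.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote_like(minority_points, n_new=4)
```

Because every synthetic point lies on a segment between two existing minority points, the new samples stay inside the region the minority class already occupies.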
5
A Diagnostic Model of Breast Cancer Based on Digital Mammogram Images Using Machine Learning Techniques. APPLIED COMPUTATIONAL INTELLIGENCE AND SOFT COMPUTING 2022. [DOI: 10.1155/2022/3895976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022] Open
Abstract
Breast cancer is one of the most commonly recorded cancers and leads to morbidity and possibly death among women around the world. Recent research statistics have shown that one in 8 women in the USA and one in 10 women in Europe are affected by breast cancer. The challenge with this disease is how to develop an easy and fast diagnostic method. One attractive approach to early breast cancer diagnosis is the analysis of mammogram images of the breast using a computer-aided diagnosis (CAD) tool. This paper first aimed to propose an efficient method for diagnosing tumors based on mammogram images of breasts using a machine learning approach. Second, it aimed to develop a CAD software program for breast cancer diagnosis based on the method proposed in the first step. The step-by-step procedure of the proposed method passes the Mammographic Image Analysis Society (MIAS) images through five steps: image preprocessing; image segmentation using the seeded region growing (SRG) algorithm; feature extraction using different feature extraction classes; selection of important and effective features using the Sequential Forward Selection (SFS) technique; and, finally, classification, in which the Support Vector Machine (SVM) algorithm is used as a binary classifier at two classification levels. The first-level classifier categorizes the given image as normal or abnormal, while the second-level classifier further classifies the abnormal image as either a malignant or benign cancer. The proposed method is studied and investigated in two phases, the training phase and the testing phase, with the MIAS dataset of mammogram images, using 70% and 30% of the dataset images for the training and testing sets, respectively. The practical implementation of the proposed method and the graphical user interface (GUI) CAD tool are carried out using MATLAB software. Experimental results have shown that the accuracy of the proposed method reached 100% in classifying mammogram images as normal or abnormal, while the classification accuracy for benign versus malignant is 87.1%.
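The 70/30 split and the two-level classification described in this abstract can be sketched as follows. This is an illustration of the control flow only: the two threshold rules stand in for the trained SVM classifiers, and the feature names `mass_area` and `irregularity`, the thresholds, and all function names are hypothetical. The authors' implementation was in MATLAB and is not reproduced here.

```python
import random


def split_70_30(items, seed=0):
    """Shuffle the items and return (training set, testing set) in a 70/30 ratio."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 7 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def two_level_diagnosis(features, is_abnormal, is_malignant):
    """Level 1 separates normal from abnormal; level 2 labels abnormal cases."""
    if not is_abnormal(features):
        return "normal"
    return "malignant" if is_malignant(features) else "benign"


# Hypothetical threshold rules standing in for the two trained binary classifiers.
def toy_abnormal(features):
    return features["mass_area"] > 50


def toy_malignant(features):
    return features["irregularity"] > 0.6


train_ids, test_ids = split_70_30(range(10))
result = two_level_diagnosis(
    {"mass_area": 80, "irregularity": 0.8}, toy_abnormal, toy_malignant
)
```

Passing the classifiers in as callables keeps the cascade independent of how each level is trained, so the second level only ever sees images the first level flagged as abnormal.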
6
Saleh H, Mostafa S, Alharbi A, El-Sappagh S, Alkhalifah T. Heterogeneous Ensemble Deep Learning Model for Enhanced Arabic Sentiment Analysis. SENSORS 2022; 22:s22103707. [PMID: 35632116 PMCID: PMC9147256 DOI: 10.3390/s22103707] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/03/2022] [Revised: 05/06/2022] [Accepted: 05/10/2022] [Indexed: 11/18/2022]
Abstract
Sentiment analysis emerged as a hot research topic a decade ago because of its increasing importance in analyzing people's opinions extracted from social media platforms. Although Arabic accounts for a significant share of the content shared across social media platforms, sentiment analysis of that content is still limited due to the language's complex morphological structures and its variety of dialects. Traditional machine learning and deep neural algorithms have been used in a variety of studies to predict sentiment. However, current mechanisms need to change to increase the accuracy of sentiment prediction. This paper proposes an optimized heterogeneous stacking ensemble model for enhancing the performance of Arabic sentiment analysis. The proposed model combines three different pre-trained Deep Learning (DL) models, Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU), in conjunction with three meta-learners, Logistic Regression (LR), Random Forest (RF), and Support Vector Machine (SVM), in order to enhance the model's performance in predicting Arabic sentiment. The performance of the proposed model is compared with that of RNN, LSTM, GRU, and five regular ML techniques, Decision Tree (DT), LR, K-Nearest Neighbor (KNN), RF, and Naive Bayes (NB), using three benchmark Arabic datasets. Parameters of the Machine Learning (ML) and DL models are optimized using grid search and KerasTuner, respectively. Accuracy, precision, recall, and F1-score were used to evaluate the performance of the models and validate the results. The results show that the proposed ensemble model achieved the best performance on each dataset compared with the other models.
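The stacking idea in this abstract, where a meta-learner is trained on the outputs of several base models, can be illustrated with a minimal sketch. Here the base-model outputs are hard-coded hypothetical probabilities standing in for RNN, LSTM, and GRU predictions, and the meta-learner is a tiny logistic regression fitted by gradient descent. None of this reproduces the paper's models, data, or tuning, and a careful stacking setup would train the meta-learner on out-of-fold base predictions.

```python
import math


def train_logistic(rows, targets, epochs=1000, lr=0.5):
    """Fit a small logistic-regression meta-learner with batch gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, targets):
            z = bias + sum(w * v for w, v in zip(weights, x))
            err = 1.0 / (1.0 + math.exp(-z)) - y  # predicted probability minus label
            grad_w = [g + err * v for g, v in zip(grad_w, x)]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def stack_predict(base_probs, weights, bias):
    """Combine the base-model probabilities into one class label (1 = positive)."""
    z = bias + sum(w * p for w, p in zip(weights, base_probs))
    return 1 if z > 0 else 0


# Hypothetical positive-class probabilities from three base models for six texts.
meta_features = [
    [0.9, 0.8, 0.7], [0.8, 0.6, 0.9], [0.7, 0.9, 0.6],
    [0.2, 0.3, 0.1], [0.1, 0.4, 0.2], [0.3, 0.2, 0.4],
]
meta_targets = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic(meta_features, meta_targets)
predictions = [stack_predict(row, weights, bias) for row in meta_features]
```

The meta-learner sees only the base models' outputs, never the raw text, which is what lets heterogeneous base models be combined behind one decision rule.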
Affiliation(s)
- Hager Saleh
  - Faculty of Computers and Artificial Intelligence, South Valley University, Hurghada 84511, Egypt
  - Correspondence: (H.S.); (T.A.)
- Sherif Mostafa
  - Faculty of Computers and Artificial Intelligence, South Valley University, Hurghada 84511, Egypt
- Abdullah Alharbi
  - Department of Information Technology, College of Computers and Information Technology, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
- Shaker El-Sappagh
  - Faculty of Computer Science and Engineering, Galala University, Suez 435611, Egypt
  - Information Systems Department, Faculty of Computers and Artificial Intelligence, Benha University, Banha 13518, Egypt
- Tamim Alkhalifah
  - Department of Computer, College of Science and Arts in Ar Rass, Qassim University, Buraydah 52571, Saudi Arabia
  - Correspondence: (H.S.); (T.A.)
7
Srikanth J, Damodaram A, Teekaraman Y, Kuppusamy R, Thelkar AR. Sentiment Analysis on COVID-19 Twitter Data Streams Using Deep Belief Neural Networks. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:8898100. [PMID: 35535182 PMCID: PMC9077450 DOI: 10.1155/2022/8898100] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/23/2022] [Accepted: 03/16/2022] [Indexed: 01/09/2023]
Abstract
Social media is Internet-based by design, allowing people to share content quickly via electronic means. People can openly express their thoughts on social media sites such as Twitter, which can then be shared with other people. During the recent COVID-19 outbreak, public opinion analytics provided useful information for determining the best public health response. At the same time, the dissemination of misinformation, aided by social media and other digital platforms, has proven to be a greater threat to global public health than the virus itself, as the COVID-19 pandemic has shown. The public's feelings on social distancing can be discovered by analysing articulated messages from Twitter. The automated method of recognizing and classifying subjective information in text data is known as sentiment analysis. In this research work, we have proposed to use a combination of preprocessing approaches such as tokenization, filtering, stemming, and building N-gram models. Deep belief neural network (DBN) with pseudo labelling is used to classify the tweets. Top layers of the base classifiers are boosted in the pseudo labelling strategy, whereas lower levels of the base classifiers share weights for feature extraction. By introducing the pseudo boost mechanism, our suggested technique preserves the same time complexity as a DBN while achieving fast convergence to optimality. The pseudo labelling improves the performance of the classification. It extracts the keywords from the tweets with high precision. The results reveal that using the DBN classifier in conjunction with the bigram in the N-gram model outperformed other models by 90.3 percent. The proposed approach can also aid medical professionals and decision-makers in determining the best course of action for each location based on their views regarding the pandemic.
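The preprocessing chain named in this abstract (tokenization, filtering, stemming, and N-gram construction) can be sketched in a few lines. The stopword list, the suffix-stripping rule, and the sample sentence below are illustrative assumptions; the abstract does not specify which tokenizer, filter, or stemmer the authors used, and a real pipeline would likely use an established stemmer.

```python
import re

STOPWORDS = {"the", "is", "a", "of", "to", "and", "in"}  # tiny illustrative list


def crude_stem(token):
    """Strip a few common English suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, tokenize on letters, drop stopwords, then stem each token."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOPWORDS]


def ngrams(tokens, n=2):
    """Return the list of consecutive n-token tuples (bigrams by default)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical tweet text, not taken from the study's data.
tokens = preprocess("Social distancing is saving lives")
bigrams = ngrams(tokens, n=2)
print(bigrams)
```

The bigram tuples would then be counted or vectorized to form the features that a classifier such as the DBN consumes.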
Affiliation(s)
- Jatla Srikanth
  - Department of Computer Science and Engineering, Aurora's Technological and Research Institute, Hyderabad 500098, TS, India
- Avula Damodaram
  - School of Information Technology (SIT), JNTUH, Hyderabad 500085, TS, India
- Yuvaraja Teekaraman
  - Department of Electronic and Electrical Engineering, The University of Sheffield, Sheffield S1 3JD, UK
- Ramya Kuppusamy
  - Department of Electrical and Electronics Engineering, Sri Sairam College of Engineering, Bangalore 562106, India
- Amruth Ramesh Thelkar
  - Faculty of Electrical & Computer Engineering, Jimma Institute of Technology, Jimma University, Jimma, Ethiopia
8
Analysis of Learner’s Sentiments to Evaluate Sustainability of Online Education System during COVID-19 Pandemic. SUSTAINABILITY 2022. [DOI: 10.3390/su14084529] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/01/2023]
Abstract
Education is an important domain that may be improved by analyzing the sentiments of learners and educators. Evaluating the sustainability of the education system is critical for the continuous improvement and satisfaction of the learner community. This research work focused on evaluating the effectiveness of the online education system adopted during the COVID-19 pandemic. For this purpose, sentiments and reviews of learners regarding the education domain during COVID-19 were collected from Twitter. To automate the evaluation, a hybrid approach was applied that used a knowledgebase of opinion words along with machine learning and boosting algorithms with n-grams (unigram, bigram, trigram, and the combination of all three). This automated approach helped to evaluate the transition of the education system in different circumstances. An ensemble classifier was created in combination with a customized knowledgebase, using the classifiers that individually performed best with each of the n-grams. Due to the imbalanced nature of the data (tweets), these operations were performed after applying the synthetic minority oversampling technique (SMOTE). The results show that the customized knowledgebase not only improved the performance of the individual classifiers but also produced quality results with the ensemble model. According to the observed results, the online education system was not found to be sustainable, as the majority of learners were badly affected by several important factors (health issues, lack of training and resources).
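The "knowledgebase of opinion words" in this abstract suggests a lexicon-based scoring step that can feed or complement the learned classifiers. The sketch below shows one simple form such a step could take. The word list, the polarity values, the function names, and the sample sentences are all hypothetical; the abstract does not describe the contents or structure of the authors' knowledgebase.

```python
# Hypothetical opinion-word lexicon: +1 for positive words, -1 for negative ones.
OPINION_WORDS = {
    "good": 1, "helpful": 1, "flexible": 1,
    "bad": -1, "stressful": -1, "boring": -1,
}


def lexicon_score(text, lexicon=OPINION_WORDS):
    """Sum the polarity of every known opinion word in the text."""
    tokens = text.lower().split()
    return sum(lexicon.get(token.strip(".,!?"), 0) for token in tokens)


def lexicon_label(text):
    """Map the summed polarity to a sentiment label."""
    score = lexicon_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical learner comments, not taken from the study's data.
label = lexicon_label("Online classes are stressful and boring!")
print(label)
```

In a hybrid setup the lexicon score could be appended to the n-gram feature vector, so the classifier can weigh curated opinion words alongside learned patterns.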