1
Lu J, Zhang H, Xiao Y, Wang Y. An Environmental Uncertainty Perception Framework for Misinformation Detection and Spread Prediction in the COVID-19 Pandemic: Artificial Intelligence Approach. JMIR AI 2024; 3:e47240. [PMID: 38875583 PMCID: PMC11041461 DOI: 10.2196/47240] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Revised: 07/30/2023] [Accepted: 12/16/2023] [Indexed: 06/16/2024]
Abstract
BACKGROUND: Amidst the COVID-19 pandemic, misinformation on social media has posed significant threats to public health. Detecting and predicting the spread of misinformation are crucial for mitigating its adverse effects. However, prevailing frameworks for these tasks have predominantly focused on post-level signals of misinformation, neglecting features of the broader information environment where misinformation originates and proliferates.
OBJECTIVE: This study aims to create a novel framework that integrates the uncertainty of the information environment into misinformation features, with the goal of enhancing the model's accuracy in tasks such as misinformation detection and predicting the scale of dissemination. The objective is to provide better support for online governance efforts during health crises.
METHODS: In this study, we embraced uncertainty features within the information environment and introduced a novel Environmental Uncertainty Perception (EUP) framework for the detection of misinformation and the prediction of its spread on social media. The framework encompasses uncertainty at 4 scales of the information environment: physical environment, macro-media environment, micro-communicative environment, and message framing. We assessed the effectiveness of the EUP using real-world COVID-19 misinformation data sets.
RESULTS: The experimental results demonstrated that the EUP alone achieved notably good performance, with detection accuracy at 0.753 and prediction accuracy at 0.71. These results were comparable to state-of-the-art baseline models such as bidirectional long short-term memory (BiLSTM; detection accuracy 0.733 and prediction accuracy 0.707) and bidirectional encoder representations from transformers (BERT; detection accuracy 0.755 and prediction accuracy 0.728). Additionally, when the baseline models collaborated with the EUP, they exhibited improved accuracy by an average of 1.98% for the misinformation detection and 2.4% for spread-prediction tasks. On unbalanced data sets, the EUP yielded relative improvements of 21.5% and 5.7% in macro-F1-score and area under the curve, respectively.
CONCLUSIONS: This study makes a significant contribution to the literature by recognizing uncertainty features within information environments as a crucial factor for improving misinformation detection and spread-prediction algorithms during the pandemic. The research elaborates on the complexities of uncertain information environments for misinformation across 4 distinct scales, including the physical environment, macro-media environment, micro-communicative environment, and message framing. The findings underscore the effectiveness of incorporating uncertainty into misinformation detection and spread prediction, providing an interdisciplinary and easily implementable framework for the field.
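The abstract above reports macro-F1 and relative improvement on unbalanced data sets without stating how either was computed. The sketch below uses the standard definition of macro-F1 (the unweighted mean of per-class F1) and assumes that "relative improvement" means (new - old) / old. The labels are toy values invented for illustration, not data from the study, and the function names are hypothetical.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (standard macro-F1)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(baseline, improved):
    """Assumed definition: (improved - baseline) / baseline."""
    return (improved - baseline) / baseline


# Toy unbalanced labels (hypothetical): 8 items of class 0, 2 of class 1.
y_true = [0] * 8 + [1] * 2
y_majority = [0] * 10                       # always predicts the majority class
y_better = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]   # finds one minority item

base = macro_f1(y_true, y_majority)
new = macro_f1(y_true, y_better)
print(accuracy(y_true, y_majority), accuracy(y_true, y_better))
print(base, new, relative_improvement(base, new))
```

In this toy case both predictors have the same accuracy while their macro-F1 scores differ, which is one reason macro-F1 is often reported alongside accuracy when classes are unbalanced.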
Affiliation(s)
- Jiahui Lu
  - State Key Laboratory of Communication Content Cognition, People's Daily Online, Beijing, China
  - School of New Media and Communication, Tianjin University, Tianjin, China
- Huibin Zhang
  - School of New Media and Communication, Tianjin University, Tianjin, China
- Yi Xiao
  - School of New Media and Communication, Tianjin University, Tianjin, China
- Yingyu Wang
  - School of New Media and Communication, Tianjin University, Tianjin, China

2
Luo J, El Baz D, Shi L. Utilizing deep learning models for ternary classification in COVID-19 infodemic detection. Digit Health 2024; 10:20552076241284773. [PMID: 39381806 PMCID: PMC11459571 DOI: 10.1177/20552076241284773] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Accepted: 09/03/2024] [Indexed: 10/10/2024]
Abstract
Objective: To address the complexities of distinguishing truth from falsehood in the context of the COVID-19 infodemic, this paper focuses on utilizing deep learning models for infodemic ternary classification detection.
Methods: Eight commonly used deep learning models are employed to categorize collected records as true, false, or uncertain. These models include fastText, three models based on recurrent neural networks, two models based on convolutional neural networks, and two transformer-based models.
Results: Precision, recall, and F1-score metrics for each category, along with overall accuracy, are presented to establish benchmark results. Additionally, a comprehensive analysis of the confusion matrix is conducted to provide insights into the models' performance.
Conclusion: Given the limited availability of infodemic records and the relatively modest size of the two tested data sets, models with pretrained embeddings or simpler architectures tend to outperform their more complex counterparts. This highlights the potential efficiency of pretrained or simpler models for ternary classification in COVID-19 infodemic detection and underscores the need for further research in this area.
Affiliation(s)
- Jia Luo
  - College of Economics and Management, Beijing University of Technology, Beijing, China
  - Chongqing Research Institute, Beijing University of Technology, Chongqing, China
- Didier El Baz
  - LAAS-CNRS, Université de Toulouse, CNRS, Toulouse, France
- Lei Shi
  - State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing, China

3
Luo J, Peng D, Shi L, El Baz D, Liu X. A comparative analysis of the COVID-19 Infodemic in English and Chinese: insights from social media textual data. Front Public Health 2023; 11:1281259. [PMID: 38035290 PMCID: PMC10686410 DOI: 10.3389/fpubh.2023.1281259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Accepted: 10/23/2023] [Indexed: 12/02/2023]
Abstract
The COVID-19 infodemic, characterized by the rapid spread of misinformation and unverified claims related to the pandemic, presents a significant challenge. This paper presents a comparative analysis of the COVID-19 infodemic in the English and Chinese languages, utilizing textual data extracted from social media platforms. To ensure a balanced representation, two infodemic datasets were created by augmenting previously collected social media textual data. Through word frequency analysis, the 30 most frequently occurring infodemic words are identified, shedding light on prevalent discussions surrounding the infodemic. Moreover, topic clustering analysis uncovers thematic structures and provides a deeper understanding of primary topics within each language context. Additionally, sentiment analysis enables comprehension of the emotional tone associated with COVID-19 information on social media platforms in English and Chinese. This research contributes to a better understanding of the COVID-19 infodemic phenomenon and can guide the development of strategies to combat misinformation during public health crises across different languages.
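The word frequency step described in this abstract can be sketched as a simple token count. The version below is a minimal illustration using only the Python standard library; the tokenizer, the stopword list, the function name `top_words`, and the sample posts are all assumptions made for illustration and are not taken from the paper. Chinese text would additionally need a word segmentation step, which is not shown here.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not the paper's list).
STOPWORDS = {"the", "a", "is", "of", "and", "to", "in"}


def top_words(texts, k=30, stopwords=STOPWORDS):
    """Return the k most frequent non-stopword tokens across all texts."""
    counts = Counter()
    for text in texts:
        # Lowercase and keep runs of letters, digits, apostrophes, hyphens.
        tokens = re.findall(r"[a-z0-9'-]+", text.lower())
        counts.update(t for t in tokens if t not in stopwords)
    return counts.most_common(k)


# Hypothetical sample posts, invented for this example.
posts = [
    "The vaccine rumor is spreading",
    "Vaccine rumor and vaccine news",
    "Reports of the virus",
]

print(top_words(posts, k=2))
```

With `k=30` the same function would return a top-30 list comparable in shape to the one the abstract mentions, although the actual preprocessing used in the study may differ.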
Affiliation(s)
- Jia Luo
  - College of Economics and Management, Beijing University of Technology, Beijing, China
  - Chongqing Research Institute, Beijing University of Technology, Chongqing, China
- Daiyun Peng
  - College of Economics and Management, Beijing University of Technology, Beijing, China
- Lei Shi
  - State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing, China
  - Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin, China
- Xinran Liu
  - College of Economics and Management, Beijing University of Technology, Beijing, China

4
Stimpson JP, Ortega AN. Social media users' perceptions about health mis- and disinformation on social media. Health Affairs Scholar 2023; 1:qxad050. [PMID: 38107206 PMCID: PMC10722559 DOI: 10.1093/haschl/qxad050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
This study used recently released nationally representative data with new measures on health information seeking to estimate the prevalence and predictors of adult social media users' perceptions of health mis- and disinformation on social media. Most adults who use social media perceive some (46%) or a lot (36%) of false or misleading health information on social media, but nearly one-fifth reported either none or a little (18%). More than two-thirds of participants reported that they were unable to assess social media information as true or false (67%). Our study identified certain population groups that might be a focus of future intervention work, such as participants who use social media to make decisions. The perception by social media users that false and misleading health information on social media is highly prevalent may lend greater urgency to mitigate the spread of false or misleading health information that harms public health.
Affiliation(s)
- Jim P. Stimpson
  - Peter O’Donnell Jr. School of Public Health, University of Texas Southwestern Medical Center, Dallas, TX 75390, United States
- Alexander N. Ortega
  - Thompson School of Social Work and Public Health, University of Hawaiʻi at Mānoa, Honolulu, HI 96822, United States

5
Park E. CRNet: a multimodal deep convolutional neural network for customer revisit prediction. Journal of Big Data 2023; 10:1. [PMID: 36618886 PMCID: PMC9808691 DOI: 10.1186/s40537-022-00674-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/27/2022] [Accepted: 11/29/2022] [Indexed: 06/17/2023]
Abstract
Since mobile food delivery services have become one of the essential issues for the restaurant industry, predicting customer revisits is highlighted as one of the significant academic and research topics. Considering that the use of multimodal datasets has gained notable attention from several scholars to address multiple industrial issues in our society, we introduce CRNet, a multimodal deep convolutional neural network for predicting customer revisits. We evaluated our approach using two datasets [a customer repurchase dataset (CRD) and mobile food delivery revisit dataset (MFDRD)] and two state-of-the-art multimodal deep learning models. The results showed that CRNet obtained accuracies of 0.9575 (CRD) and 0.9436 (MFDRD) and F1-scores of 0.9730 (CRD) and 0.9509 (MFDRD), thus achieving higher performance levels than current state-of-the-art multimodal frameworks (accuracy: 0.7417-0.9012; F1-score: 0.7461-0.9378). Future research should aim to address other resources that can enhance the proposed framework (e.g., metadata information).
Supplementary Information: The online version contains supplementary material available at 10.1186/s40537-022-00674-4.
Affiliation(s)
- Eunil Park
  - Department of Interaction Science, Sungkyunkwan University, Seoul, Republic of Korea
  - Department of Applied Artificial Intelligence, Sungkyunkwan University, Seoul, Republic of Korea
  - Department of Human-Artificial Intelligence Interaction, Sungkyunkwan University, Seoul, Republic of Korea

6
Huang X, Wang S, Zhang M, Hu T, Hohl A, She B, Gong X, Li J, Liu X, Gruebner O, Liu R, Li X, Liu Z, Ye X, Li Z. Social media mining under the COVID-19 context: Progress, challenges, and opportunities. International Journal of Applied Earth Observation and Geoinformation: ITC Journal 2022; 113:102967. [PMID: 36035895 PMCID: PMC9391053 DOI: 10.1016/j.jag.2022.102967] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/26/2022] [Revised: 06/17/2022] [Accepted: 08/05/2022] [Indexed: 05/21/2023]
Abstract
Social media platforms allow users worldwide to create and share information, forging vast sensing networks that allow information on certain topics to be collected, stored, mined, and analyzed in a rapid manner. During the COVID-19 pandemic, extensive social media mining efforts have been undertaken to tackle COVID-19 challenges from various perspectives. This review summarizes the progress of social media data mining studies in the COVID-19 contexts and categorizes them into six major domains, including early warning and detection, human mobility monitoring, communication and information conveying, public attitudes and emotions, infodemic and misinformation, and hatred and violence. We further document essential features of publicly available COVID-19 related social media data archives that will benefit research communities in conducting replicable and reproducible studies. In addition, we discuss seven challenges in social media analytics associated with their potential impacts on derived COVID-19 findings, followed by our visions for the possible paths forward in regard to social media-based COVID-19 investigations. This review serves as a valuable reference that recaps social media mining efforts in COVID-19 related studies and provides future directions along which the information harnessed from social media can be used to address public health emergencies.
Affiliation(s)
- Xiao Huang
  - Department of Geosciences, University of Arkansas, Fayetteville, AR 72701, USA
- Siqin Wang
  - School of Earth Environmental Sciences, University of Queensland, Brisbane, Queensland 4076, Australia
- Mengxi Zhang
  - Department of Nutrition and Health Science, Ball State University, Muncie, IN 47304, USA
- Tao Hu
  - Department of Geography, Oklahoma State University, Stillwater, OK 74078, USA
- Alexander Hohl
  - Department of Geography, The University of Utah, Salt Lake City, UT 84112, USA
- Bing She
  - Institute for Social Research, University of Michigan, Ann Arbor, MI 48109, USA
- Xi Gong
  - Department of Geography & Environmental Studies, University of New Mexico, Albuquerque, NM 87131, USA
- Jianxin Li
  - School of Information Technology, Deakin University, Geelong, Victoria 3220, Australia
- Xiao Liu
  - School of Information Technology, Deakin University, Geelong, Victoria 3220, Australia
- Oliver Gruebner
  - Department of Geography, University of Zurich, Zürich CH-8006, Switzerland
- Regina Liu
  - Department of Biology, Mercer University, Macon, GA 31207, USA
- Xiao Li
  - Texas A&M Transportation Institute, Bryan, TX 77807, USA
- Zhewei Liu
  - Department of Land Surveying and Geo-informatics, The Hong Kong Polytechnic University, Hung Hom, Hong Kong, China
- Xinyue Ye
  - Department of Landscape Architecture and Urban Planning, Texas A&M University, College Station, TX 77840, USA
- Zhenlong Li
  - Geoinformation and Big Data Research Lab, Department of Geography, University of South Carolina, Columbia, SC 29208, USA

7
Balakrishnan V, Ng WZ, Soo MC, Han GJ, Lee CJ. Infodemic and fake news - A comprehensive overview of its global magnitude during the COVID-19 pandemic in 2021: A scoping review. International Journal of Disaster Risk Reduction: IJDRR 2022; 78:103144. [PMID: 35791376 PMCID: PMC9247231 DOI: 10.1016/j.ijdrr.2022.103144] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 12/20/2021] [Revised: 06/22/2022] [Accepted: 06/23/2022] [Indexed: 05/04/2023]
Abstract
The spread of fake news increased dramatically during the COVID-19 pandemic worldwide. This study aims to synthesize the extant literature to understand the magnitude of this phenomenon in the wake of the pandemic in 2021, focusing on the motives and sociodemographic profiles, Artificial Intelligence (AI)-based tools developed, and the top trending topics related to fake news. A scoping review was adopted targeting articles published in five academic databases (January 2021-November 2021), resulting in 97 papers. Most of the studies were empirical in nature (N = 69) targeting the general population (N = 26) and social media users (N = 13), followed by AI-based detection tools (N = 27). Top motives for sharing fake news included low awareness, knowledge, and health/media literacy; entertainment, passing time, and socialization; altruism; and low trust in government and news media. The phenomenon was more prominent among males, younger people, and those with low education. Machine and deep learning emerged as the widely explored techniques for detecting fake news, whereas the top topics were related to vaccines, the virus, cures/remedies, treatment, and prevention. Immediate intervention and prevention efforts are needed to curb this anti-social behavior considering the world is still struggling to contain the spread of the COVID-19 virus.
Affiliation(s)
- Vimala Balakrishnan
  - Faculty of Computer Science & Information Technology, Universiti Malaya, 50603, Lembah Pantai, Kuala Lumpur, Malaysia
- Wei Zhen Ng
  - Faculty of Computer Science & Information Technology, Universiti Malaya, 50603, Lembah Pantai, Kuala Lumpur, Malaysia
- Mun Chong Soo
  - Faculty of Computer Science & Information Technology, Universiti Malaya, 50603, Lembah Pantai, Kuala Lumpur, Malaysia
- Gan Joo Han
  - Faculty of Computer Science & Information Technology, Universiti Malaya, 50603, Lembah Pantai, Kuala Lumpur, Malaysia
- Choon Jiat Lee
  - Faculty of Medicine, Universiti Malaya, 50603, Lembah Pantai, Kuala Lumpur, Malaysia

8
Álvarez-Carmona MA, Aranda R, Rodríguez-González AY, Pellegrin L, Carlos H. Classifying the Mexican epidemiological semaphore colour from the Covid-19 text Spanish news. J Inf Sci 2022. [DOI: 10.1177/01655515221100952] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
This work aims to generate classification models that help determine the colour of an epidemiological semaphore (ES) by analysing online news, so as to be better prepared for changes in the evolution of the pandemic. To accomplish this, we introduce the Cov-NES-Mex corpus, a collection of 77,983 news items (labelled with the Mexican ES system) related to Covid-19 for the 32 regions of Mexico. We also report measures showing that the corpus is imbalanced and has a high vocabulary overlap between classes. In addition, evaluation measurements of the pandemic by region are proposed. Furthermore, a classification model based on a transformer architecture specialised for the Spanish language achieved an F-measure of up to 0.83. Thus, this work provides evidence that there is essential information in the news that can be used to determine the colour of the ES up to 4 weeks in advance. Finally, the presented results could be applied to other Spanish-speaking countries that do not have an ES system, thus inferring and comparing their situation with respect to the Mexican ES.
Affiliation(s)
- Miguel A Álvarez-Carmona
  - Cátedras Conacyt - Centro de Investigacion Cientifica y de Educacion Superior de Ensenada, Mexico
- Ramón Aranda
  - Cátedras Conacyt - Centro de Investigacion Cientifica y de Educacion Superior de Ensenada, Mexico
- Hugo Carlos
  - Cátedras Conacyt - Centro de Investigacion en Ciencias de Información Geoespacial, Mexico

9
Wan M, Su Q, Xiang R, Huang CR. Data-driven analytics of COVID-19 'infodemic'. International Journal of Data Science and Analytics 2022; 15:313-327. [PMID: 35730040 PMCID: PMC9194350 DOI: 10.1007/s41060-022-00339-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/25/2021] [Accepted: 04/30/2022] [Indexed: 12/02/2022]
Abstract
The rampant spread of the COVID-19 infodemic has been almost simultaneous with the outbreak of the pandemic. Many concerted efforts have been made to mitigate its negative effect on information credibility and data legitimacy. Existing work mainly focuses on fact-checking algorithms or multi-class labeling models that are less aware of the intrinsic characteristics of the language. Nor is it discussed how such representations can account for the common psycho-socio-behavior of the information consumers. This work takes a data-driven analytical approach to (1) describe the prominent lexical and grammatical features of COVID-19 misinformation; (2) interpret the underlying (psycho-)linguistic triggers in terms of sentiment, power and activity based on the affective control theory; (3) study the feature indexing for anti-infodemic modeling. The results show distinct language generalization patterns in misinformation, which favors evaluative terms and multimedia devices in delivering a negative sentiment. Such appeals are effective in arousing people's sympathy toward the vulnerable community and in fomenting their spreading behavior.
Affiliation(s)
- Minyu Wan
  - Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong, China
- Qi Su
  - School of Foreign Languages, Peking University, Beijing, China
- Rong Xiang
  - Department of Computing, The Hong Kong Polytechnic University, Hong Kong, China
- Chu-Ren Huang
  - Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong, China