1
Zhang HQ, Liu SH, Li R, Yu JW, Ye DX, Yuan SS, Lin H, Huang CB, Tang H. MIBPred: Ensemble Learning-Based Metal Ion-Binding Protein Classifier. ACS Omega 2024; 9:8439-8447. [PMID: 38405489 PMCID: PMC10882704 DOI: 10.1021/acsomega.3c09587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Revised: 01/16/2024] [Accepted: 01/22/2024] [Indexed: 02/27/2024]
Abstract
In biological organisms, metal ion-binding proteins participate in numerous metabolic activities and are closely associated with various diseases. To accurately predict whether a protein binds to metal ions and the type of metal ion-binding protein, this study proposed a classifier named MIBPred. The classifier used Word2Vec, a technique from natural language processing, to extract semantic features of the protein sequence language and combined them with position-specific score matrix (PSSM) features. Furthermore, an ensemble learning model was employed for the metal ion-binding protein classification task. In the model, we independently trained XGBoost, LightGBM, and CatBoost algorithms and integrated the output results through an SVM voting mechanism. With this combination, the model achieved accuracies of 95.13% and 85.19% in predicting metal ion-binding proteins and their types, respectively. Our research not only confirms the effectiveness of Word2Vec in extracting semantic information from protein sequences but also highlights the performance of the MIBPred classifier in classifying metal ion-binding protein types. This study provides a reliable tool and method for the in-depth exploration of the structure and function of metal ion-binding proteins.
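To make the "protein sequence language" idea concrete, the following is a minimal sketch of one common way to turn a sequence into "words" before training Word2Vec: splitting it into overlapping k-mers. The abstract does not say how MIBPred tokenizes sequences, so the k-mer scheme, the function name `kmer_tokens`, the choice `k=3`, and the toy sequence are all assumptions made for illustration.

```python
from typing import List


def kmer_tokens(sequence: str, k: int = 3) -> List[str]:
    """Split a protein sequence into overlapping k-mers ("words").

    Assumption: overlapping k-mers with stride 1; the paper may use a
    different tokenization.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.strip().upper()
    # A sequence shorter than k yields no tokens.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# Hypothetical toy sequence, not taken from the paper's dataset.
tokens = kmer_tokens("MKTAYIAK", k=3)
print(tokens)
```

Each token list can then be treated like a sentence by an embedding model, and the resulting vectors can be concatenated with other features such as PSSM descriptors.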
Affiliation(s)
- Hong-Qi Zhang
  - School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Shang-Hua Liu
  - School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Rui Li
  - School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Jun-Wen Yu
  - School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Dong-Xin Ye
  - School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Shi-Shi Yuan
  - School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Hao Lin
  - School of Life Science and Technology and Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Cheng-Bing Huang
  - School of Computer Science and Technology, Aba Teachers University, Aba 623002, China
- Hua Tang
  - School of Basic Medical Sciences, Southwest Medical University, Luzhou 646000, China
  - Central Nervous System Drug Key Laboratory of Sichuan Province, Luzhou 646000, China
2
Li L, Zhou J, Zhuang J, Zhang Q. Gender-specific emotional characteristics of crisis communication on social media: Case studies of two public health crises. Inf Process Manag 2023. [DOI: 10.1016/j.ipm.2023.103299] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/15/2023]
3
Dupuy-Zini A, Audeh B, Gérardin C, Duclos C, Gagneux-Brunon A, Bousquet C. Users' Reactions to Announced Vaccines Against COVID-19 Before Marketing in France: Analysis of Twitter Posts. J Med Internet Res 2023; 25:e37237. [PMID: 36596215 PMCID: PMC10132828 DOI: 10.2196/37237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2022] [Revised: 07/17/2022] [Accepted: 08/09/2022] [Indexed: 01/04/2023] Open
Abstract
BACKGROUND: Within a few months, the COVID-19 pandemic had spread to many countries and had become a real challenge for health systems all around the world. This unprecedented crisis has led to a surge of online discussions about potential cures for the disease. Among them, vaccines have been at the heart of the debates and faced a lack of confidence before marketing in France. OBJECTIVE: This study aims to identify and investigate the opinions of French Twitter users on the announced vaccines against COVID-19 through sentiment analysis. METHODS: This study was conducted in 2 phases. First, we filtered a collection of tweets related to COVID-19 available on Twitter from February 2020 to August 2020 with a set of keywords associated with vaccine mistrust using word embeddings. Second, we performed sentiment analysis using deep learning to identify the characteristics of vaccine mistrust. The model was trained on a hand-labeled subset of 4548 tweets. RESULTS: A set of 69 relevant keywords was identified as the semantic concept of the word "vaccin" (vaccine in French); these keywords focused mainly on conspiracies, pharmaceutical companies, and alternative treatments. They enabled us to extract nearly 350,000 tweets in French. The sentiment analysis model achieved 0.75 accuracy. The model then classified 16% of tweets as positive, 41% as negative, and 43% as neutral. This allowed us to explore the semantic concepts of positive and negative tweets and to plot the trends of each sentiment. The main negative rhetoric identified from users' tweets was that vaccines are perceived as having a political purpose and that COVID-19 is a commercial argument for the pharmaceutical companies. CONCLUSIONS: Twitter might be a useful tool to investigate the arguments for vaccine mistrust because it unveils political criticism contrasting with the usual concerns about adverse drug reactions. As the opposition rhetoric is more consistent and more widely spread than the positive rhetoric, we believe that this research provides effective tools to help health authorities better characterize the risk of vaccine mistrust.
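The keyword-expansion step in METHODS (finding words semantically close to "vaccin" with word embeddings) can be sketched roughly as a nearest-neighbor search under cosine similarity. This is only an illustration: the three-dimensional vectors, the candidate words, and the helper names `cosine` and `nearest_keywords` are hypothetical, and the study's actual embedding model and its 69 keywords are not reproduced here.

```python
import math
from typing import Dict, List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_keywords(seed: str, embeddings: Dict[str, List[float]],
                     top_n: int = 2) -> List[str]:
    """Return the top_n words most similar to the seed word."""
    seed_vec = embeddings[seed]
    scored = [(word, cosine(seed_vec, vec))
              for word, vec in embeddings.items() if word != seed]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in scored[:top_n]]


# Hypothetical toy embeddings; real ones would come from a trained model.
toy_embeddings = {
    "vaccin": [1.0, 0.0, 0.2],
    "laboratoire": [0.9, 0.1, 0.3],
    "complot": [0.8, 0.3, 0.1],
    "météo": [0.0, 1.0, 0.0],
}

keywords = nearest_keywords("vaccin", toy_embeddings, top_n=2)
print(keywords)
```

The resulting keyword list would then serve as a filter over the tweet collection, keeping only tweets that contain at least one of the expanded terms.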
Affiliation(s)
- Alexandre Dupuy-Zini
  - Laboratoire d'Informatique Médicale et d'Ingénierie des connaissances en e-Santé, LIMICS, Sorbonne Université, Université Sorbonne Paris Nord, Institut national de la santé et de la recherche médicale, INSERM, Paris, France
- Bissan Audeh
  - Laboratoire d'Informatique Médicale et d'Ingénierie des connaissances en e-Santé, LIMICS, Sorbonne Université, Université Sorbonne Paris Nord, Institut national de la santé et de la recherche médicale, INSERM, Paris, France
- Christel Gérardin
  - Institut Pierre Louis d'Epidémiologie et de Santé Publique, Département de médecine interne, Sorbonne Université, Paris, France
- Catherine Duclos
  - Laboratoire d'Informatique Médicale et d'Ingénierie des connaissances en e-Santé, LIMICS, Sorbonne Université, Université Sorbonne Paris Nord, Institut national de la santé et de la recherche médicale, INSERM, Paris, France
- Amandine Gagneux-Brunon
  - Groupe sur l'Immunité des Muqueuses et Agents Pathogènes, Centre International de Recherche en Infectiologie, University of Lyon, Saint Etienne, France
  - Vaccinologie, Centre Hospitalier Universitaire de Saint-Etienne, Saint Etienne, France
- Cedric Bousquet
  - Laboratoire d'Informatique Médicale et d'Ingénierie des connaissances en e-Santé, LIMICS, Sorbonne Université, Université Sorbonne Paris Nord, Institut national de la santé et de la recherche médicale, INSERM, Paris, France
  - Service de santé publique et information médicale, Centre Hospitalier Universitaire de Saint Etienne, Saint Etienne, France
4
Chen S, Yin SJ, Guo Y, Ge Y, Janies D, Dulin M, Brown C, Robinson P, Zhang D. Content and sentiment surveillance (CSI): A critical component for modeling modern epidemics. Front Public Health 2023; 11:1111661. [PMID: 37006544 PMCID: PMC10061006 DOI: 10.3389/fpubh.2023.1111661] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Accepted: 02/21/2023] [Indexed: 03/18/2023] Open
Abstract
Comprehensive surveillance systems are key to providing accurate data for effective modeling. Traditional symptom-based case surveillance has been joined by recent genomic, serologic, and environmental surveillance to provide more integrated disease surveillance systems. A major gap in comprehensive disease surveillance is the accurate, real-time monitoring of potential population behavioral changes. Population-wide behaviors such as compliance with various interventions and vaccination acceptance significantly influence and drive the overall epidemic dynamics in society. Infoveillance originally utilized online query data (e.g., Google and Wikipedia searches on a specific topic such as an epidemic) and later focused on large volumes of online discourse data from social media platforms to further augment epidemic modeling. It mainly uses the number of posts to approximate public awareness of the disease, and compares this with observed epidemic dynamics for better projection. The COVID-19 pandemic shows that there is an urgent need to further harness the rich, detailed content and sentiment information, which can provide more accurate and granular information on public awareness and perceptions of multiple aspects of the disease, especially various interventions. In this perspective paper, we describe a novel conceptual analytical framework of content and sentiment infoveillance (CSI) and its integration with epidemic modeling. This CSI framework includes data retrieval and pre-processing; information extraction via natural language processing to identify and quantify detailed time, location, content, and sentiment information; and integration of infoveillance with common epidemic modeling techniques, both mechanistic and data-driven. CSI complements and significantly enhances current epidemic models for more informed decision-making by integrating behavioral aspects from detailed, instantaneous infoveillance of massive social media data.
Affiliation(s)
- Shi Chen
  - Department of Public Health Sciences, College of Health and Human Services, University of North Carolina at Charlotte, Charlotte, NC, United States
  - School of Data Science, University of North Carolina at Charlotte, Charlotte, NC, United States
  - Academy for Population Health Innovation, University of North Carolina at Charlotte, Charlotte, NC, United States
- Shuhua Jessica Yin
  - Department of Software and Information Systems, College of Computing and Informatics, University of North Carolina at Charlotte, Charlotte, NC, United States
- Yuqi Guo
  - School of Data Science, University of North Carolina at Charlotte, Charlotte, NC, United States
  - School of Social Work, College of Health and Human Services, University of North Carolina at Charlotte, Charlotte, NC, United States
- Yaorong Ge
  - Department of Software and Information Systems, College of Computing and Informatics, University of North Carolina at Charlotte, Charlotte, NC, United States
- Daniel Janies
  - Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, Charlotte, NC, United States
- Michael Dulin
  - Department of Public Health Sciences, College of Health and Human Services, University of North Carolina at Charlotte, Charlotte, NC, United States
  - Academy for Population Health Innovation, University of North Carolina at Charlotte, Charlotte, NC, United States
- Cheryl Brown
  - School of Data Science, University of North Carolina at Charlotte, Charlotte, NC, United States
  - Department of Political Science and Public Administration, College of Liberal Arts and Sciences, University of North Carolina at Charlotte, Charlotte, NC, United States
- Patrick Robinson
  - Department of Public Health Sciences, College of Health and Human Services, University of North Carolina at Charlotte, Charlotte, NC, United States
  - Academy for Population Health Innovation, University of North Carolina at Charlotte, Charlotte, NC, United States
- Dongsong Zhang
  - School of Data Science, University of North Carolina at Charlotte, Charlotte, NC, United States
  - Belk College of Business, University of North Carolina at Charlotte, Charlotte, NC, United States
5
Chai Y, Palacios J, Wang J, Fan Y, Zheng S. Measuring daily-life fear perception change: A computational study in the context of COVID-19. PLoS One 2022; 17:e0278322. [PMID: 36548306 PMCID: PMC9779044 DOI: 10.1371/journal.pone.0278322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2021] [Accepted: 11/15/2022] [Indexed: 12/24/2022] Open
Abstract
COVID-19, as a global health crisis, has triggered fear with unprecedented intensity. Besides the fear of getting infected, the outbreak of COVID-19 also created significant disruptions in people's daily lives and thus evoked intense psychological responses not directly related to COVID-19 infection. In this study, we construct a panel database of expressed fear, tracking the universe of social media posts (16 million) generated by 536 thousand individuals between January 1st, 2019 and August 31st, 2020 in China. We employ deep learning techniques to detect expressions of fear within each post, and then apply a topic model to extract the major topics of fear expressions in our sample during the COVID-19 pandemic. Our unique database includes a comprehensive list of topics and is not limited to posts centered on COVID-19. Based on this database, we find that sleep disorders ("nightmare" and "insomnia") take up the largest share of fear-labeled posts in the pre-pandemic period (January 2019-December 2019) and increase significantly during the COVID-19 pandemic. We identify health and work-related concerns as the two major sources of non-COVID fear during the pandemic period. We also detect gender differences, with females expressing more fear about health topics and males about monetary concerns. Our research shows how applying fear detection and topic modeling techniques to posts unrelated to COVID-19 can provide additional policy value in discerning broader societal concerns during the COVID-19 crisis.
Affiliation(s)
- Yuchen Chai
  - Department of Urban Studies and Planning, Massachusetts Institute of Technology, Cambridge, MA, United States of America
- Juan Palacios
  - Department of Urban Studies and Planning, Massachusetts Institute of Technology, Cambridge, MA, United States of America
- Jianghao Wang
  - Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China
- Yichun Fan
  - Department of Urban Studies and Planning, Massachusetts Institute of Technology, Cambridge, MA, United States of America
- Siqi Zheng
  - Department of Urban Studies and Planning, Massachusetts Institute of Technology, Cambridge, MA, United States of America
6
Gao H, Zhao Q, Ning C, Guo D, Wu J, Li L. Does the COVID-19 Vaccine Still Work That "Most of the Confirmed Cases Had Been Vaccinated"? A Content Analysis of Vaccine Effectiveness Discussion on Sina Weibo during the Outbreak of COVID-19 in Nanjing. International Journal of Environmental Research and Public Health 2021; 19:241. [PMID: 35010501 PMCID: PMC8750531 DOI: 10.3390/ijerph19010241] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/12/2021] [Revised: 12/22/2021] [Accepted: 12/24/2021] [Indexed: 01/19/2023]
Abstract
In July 2021, breakthrough cases were reported in the outbreak of COVID-19 in Nanjing, sparking concern and discussion about the vaccine's effectiveness and becoming a trending topic on Sina Weibo. To explore public attitudes towards the COVID-19 vaccine and their emotional orientations, we collected 1542 posts under the trending topic through data mining. We set up four categories of attitudes towards COVID-19 vaccines, used a big data analysis tool to code the posts, and manually checked the coding results to complete the content analysis. The results showed that 45.14% of the Weibo posts (n = 1542) supported the COVID-19 vaccine, 12.97% were neutral, and 7.26% were doubtful, which indicated that the public did not question the vaccine's effectiveness because of the breakthrough cases in Nanjing. In total, 66.47% of posts reflected significant negative emotions. Among these, 50.44% were directed towards the media, 25.07% towards the posting users, and 11.51% towards the public, which indicated that the negative emotions were not directed towards the COVID-19 vaccine. Factors external to the vaccine might cause vaccine hesitancy. Public opinions expressed in online media reflect the public's cognition and attitude towards vaccines and their core needs in terms of information. Therefore, online public opinion monitoring could be an essential way to understand opinions and attitudes towards public health issues.
Affiliation(s)
- Hao Gao
  - School of Journalism and Communication, Nanjing Normal University, Nanjing 210097, China
- Qingting Zhao
  - School of Journalism and Communication, Nanjing Normal University, Nanjing 210097, China
- Chuanlin Ning
  - School of Media and Communication, Shanghai Jiao Tong University, Shanghai 200240, China
- Difan Guo
  - School of Journalism and Communication, Nanjing Normal University, Nanjing 210097, China
- Jing Wu
  - Faculty of Social Sciences, University of Ljubljana, 1000 Ljubljana, Slovenia
- Lina Li
  - Film-Television and Communication College, Shanghai Normal University, Shanghai 200234, China
7
Wang AW, Lan JY, Wang MH, Yu C. The Evolution of Rumors on a Closed Social Networking Platform During COVID-19: Algorithm Development and Content Study. JMIR Med Inform 2021; 9:e30467. [PMID: 34623954 PMCID: PMC8612313 DOI: 10.2196/30467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2021] [Revised: 06/29/2021] [Accepted: 09/10/2021] [Indexed: 12/03/2022] Open
Abstract
BACKGROUND: In 2020, the COVID-19 pandemic put the world in a crisis regarding both physical and psychological health. Simultaneously, a myriad of unverified information flowed on social media and online outlets. The situation was so severe that the World Health Organization identified it as an infodemic in February 2020. OBJECTIVE: The aim of this study was to examine the propagation patterns and textual transformation of COVID-19-related rumors on a closed social media platform. METHODS: We obtained a data set of suspicious text messages collected on Taiwan's most popular instant messaging platform, LINE, between January and July 2020. We proposed a classification-based clustering algorithm that could efficiently cluster messages into groups, with each group representing a rumor. For ease of understanding, a group is referred to as a "rumor group." Messages in a rumor group could be identical or could have limited textual differences between them. Therefore, each message in a rumor group is a form of the rumor. RESULTS: A total of 936 rumor groups with at least 10 messages each were discovered among 114,124 text messages collected from LINE. Among the 936 rumors, 396 (42.3%) were related to COVID-19. Of the 396 COVID-19-related rumors, 134 (33.8%) had been fact-checked by International Fact-Checking Network-certified agencies in Taiwan and determined to be false or misleading. By studying the prevalence of simplified Chinese characters or phrases in the messages that originated in China, we found that COVID-19-related messages, compared to non-COVID-19-related messages, were more likely to have been written by non-Taiwanese users. The association was statistically significant, with P<.001, as determined by the chi-square independence test. The qualitative investigations of the three most popular COVID-19 rumors revealed that key authoritative figures, mostly medical personnel, were often misquoted in the messages. In addition, these rumors resurfaced multiple times after being fact-checked, usually preceded by major societal events or textual transformations. CONCLUSIONS: To fight the infodemic, it is crucial that we first understand why and how a rumor becomes popular. While social media has given rise to an unprecedented number of unverified rumors, it also provides a unique opportunity for us to study the propagation of rumors and their interactions with society. Therefore, we must put more effort into these areas.
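The notion of a "rumor group" in METHODS (messages that are identical or differ only slightly) can be illustrated with a much simpler stand-in than the authors' classification-based clustering algorithm, which the abstract does not specify in detail. The sketch below uses greedy grouping by string similarity from Python's standard `difflib`; the threshold of 0.8, the function name `group_messages`, and the sample messages are all hypothetical.

```python
from difflib import SequenceMatcher
from typing import List


def group_messages(messages: List[str], threshold: float = 0.8) -> List[List[str]]:
    """Greedily assign each message to the first group whose representative
    (the group's first message) is at least `threshold` similar to it.

    Note: this is a simple similarity-threshold heuristic, not the
    classification-based clustering described in the paper.
    """
    groups: List[List[str]] = []
    for message in messages:
        for group in groups:
            similarity = SequenceMatcher(None, group[0], message).ratio()
            if similarity >= threshold:
                group.append(message)
                break
        else:
            # No existing group was close enough: start a new rumor group.
            groups.append([message])
    return groups


# Hypothetical messages, invented for illustration only.
sample = [
    "Drinking hot water cures the virus",
    "Drinking hot water cures the virus!!",
    "Schools reopen on Monday",
]
rumor_groups = group_messages(sample)
print(len(rumor_groups))
```

Comparing every message against every group representative scales poorly to a corpus of 114,124 messages, which is one plausible reason the authors emphasize efficiency in their own algorithm.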
Affiliation(s)
- Andrea W Wang
  - Information Operations Research Group, Taipei, Taiwan
- Jo-Yu Lan
  - Department of Information Engineering and Computer Science, Feng Chia University, Taichung, Taiwan
- Ming-Hung Wang
  - Department of Computer Science and Information Engineering, National Chung Cheng University, Chiayi, Taiwan
- Chihhao Yu
  - Information Operations Research Group, Taipei, Taiwan
8
Chen S, Paul R, Janies D, Murphy K, Feng T, Thill JC. Exploring Feasibility of Multivariate Deep Learning Models in Predicting COVID-19 Epidemic. Front Public Health 2021; 9:661615. [PMID: 34291025 PMCID: PMC8287417 DOI: 10.3389/fpubh.2021.661615] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/08/2021] [Accepted: 05/20/2021] [Indexed: 11/13/2022] Open
Abstract
Background: Mathematical models are powerful tools to study COVID-19. However, one fundamental challenge in current modeling approaches is the lack of accurate and comprehensive data. Complex epidemiological systems such as COVID-19 are especially challenging for the commonly used mechanistic models while our understanding of the pandemic is changing rapidly. Objective: We aim to develop a data-driven workflow to extract and process data and to develop deep learning (DL) methods to model the COVID-19 epidemic. We provide an alternative modeling approach to complement the current mechanistic modeling paradigm. Method: We extensively searched, extracted, and annotated relevant datasets from over 60 official press releases in Hubei, China, in 2020. Multivariate long short-term memory (LSTM) models were developed with different architectures to track and predict multivariate COVID-19 time series for 1, 2, and 3 days ahead. As a comparison, univariate LSTMs were also developed to track new cases, total cases, and new deaths. Results: A comprehensive dataset with 10 variables was retrieved and processed for 125 days in Hubei. Multivariate LSTM had reasonably good predictability for new deaths, hospitalization of both severe and critical patients, total discharges, and total monitored in hospital. Multivariate LSTM showed better results than its univariate counterparts for new and total cases and new deaths in 1-day-ahead prediction, but not in 2-day and 3-day-ahead predictions. In addition, more complex LSTM architectures did not appear to increase overall predictability in this study. Conclusion: This study demonstrates the feasibility of DL models to complement current mechanistic approaches when the exact epidemiological mechanisms are still under investigation.
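Predicting a multivariate series 1, 2, or 3 days ahead usually starts by slicing the daily records into input windows and future targets. The following is a minimal sketch of that preprocessing step only (no LSTM is trained here); the lookback length, the function name `make_windows`, and the two-variable toy series are assumptions, since the abstract does not state how the authors framed their inputs.

```python
from typing import List, Tuple

Day = List[float]  # one day's values for all variables


def make_windows(series: List[Day], lookback: int,
                 horizon: int) -> Tuple[List[List[Day]], List[Day]]:
    """Build (input window, target) pairs for horizon-day-ahead prediction.

    Each input is `lookback` consecutive days; its target is the day that
    lies `horizon` days after the last day of the window.
    """
    if lookback <= 0 or horizon <= 0:
        raise ValueError("lookback and horizon must be positive")
    inputs: List[List[Day]] = []
    targets: List[Day] = []
    last_start = len(series) - lookback - horizon
    for start in range(last_start + 1):
        end = start + lookback
        inputs.append(series[start:end])
        targets.append(series[end + horizon - 1])
    return inputs, targets


# Hypothetical toy series: 6 days, 2 variables per day.
toy_days = [[float(d), float(10 * d)] for d in range(6)]
X, y = make_windows(toy_days, lookback=3, horizon=2)
print(len(X), y[0])
```

Under this framing, a series of 125 days yields 125 - lookback - horizon + 1 training pairs, so longer horizons leave slightly fewer examples to learn from.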
Affiliation(s)
- Shi Chen
  - Department of Public Health Sciences, University of North Carolina at Charlotte, Charlotte, NC, United States
  - School of Data Science, University of North Carolina at Charlotte, Charlotte, NC, United States
- Rajib Paul
  - Department of Public Health Sciences, University of North Carolina at Charlotte, Charlotte, NC, United States
  - School of Data Science, University of North Carolina at Charlotte, Charlotte, NC, United States
- Daniel Janies
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, United States
- Keith Murphy
  - Department of Public Health Sciences, University of North Carolina at Charlotte, Charlotte, NC, United States
- Tinghao Feng
  - Department of Computer Science, University of North Carolina at Charlotte, Charlotte, NC, United States
- Jean-Claude Thill
  - School of Data Science, University of North Carolina at Charlotte, Charlotte, NC, United States
  - Department of Geography and Earth Sciences, University of North Carolina at Charlotte, Charlotte, NC, United States