1
|
Tang X, Zhou H, Li S. Predictable by publication: discovery of early highly cited academic papers based on their own features. Library Hi Tech 2023. [DOI: 10.1108/lht-06-2022-0305] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
Abstract
Purpose
Predicting highly cited papers can enable an evaluation of the potential of papers and the early detection and determination of academic achievement value. However, most highly cited paper prediction studies consider early citation information, so predicting highly cited papers by publication is challenging. Therefore, the authors propose a method for predicting early highly cited papers based on their own features.
Design/methodology/approach
This research analyzed academic papers published in the Journal of the Association for Computing Machinery (ACM) from 2000 to 2013. Five types of features were extracted: paper features, journal features, author features, reference features and semantic features. Subsequently, the authors applied a deep neural network (DNN), support vector machine (SVM), decision tree (DT) and logistic regression (LGR), and they predicted highly cited papers 1–3 years after publication.
Findings
Experimental results showed that early highly cited academic papers are predictable when they are first published. The authors’ prediction models showed considerable performance. This study further confirmed that the features of references and authors play an important role in predicting early highly cited papers. In addition, the proportion of high-quality journal references has a more significant impact on prediction.
Originality/value
Based on the available information at the time of publication, this study proposed an effective early highly cited paper prediction model. This study facilitates the early discovery and realization of the value of scientific and technological achievements.
|
2
|
Hassan SU, Aljohani NR, Tarar UI, Safder I, Sarwar R, Alelyani S, Nawaz R. Exploiting tweet sentiments in altmetrics large-scale data. J Inf Sci 2022. [DOI: 10.1177/01655515211043713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
This article aims to exploit social exchanges on scientific literature, specifically tweets, to analyse social media users’ sentiments towards publications within a research field. First, we employ the SentiStrength tool, extended with newly created lexicon terms, to classify the sentiments of 6,482,260 tweets associated with 1,083,535 publications provided by Altmetric.com. Then, we propose harmonic means-based statistical measures to generate a specialised lexicon, using positive and negative sentiment scores and frequency metrics. Next, we adopt a novel article-level summarisation approach to domain-level sentiment analysis to gauge the opinion of social media users on Twitter about the scientific literature. Last, we propose and employ an aspect-based analytical approach to mine users’ expressions relating to various aspects of the article, such as tweets on its title, abstract, methodology, conclusion or results section. We show that research communities exhibit dissimilar sentiments towards their respective fields. The analysis of the field-wise distribution of article aspects shows that in Medicine, Economics, Business and Decision Sciences, tweet aspects are focused on the results section. In contrast, in Physics and Astronomy, Materials Sciences and Computer Science, these aspects are focused on the methodology section. Overall, the study helps us to understand the sentiments of online social exchanges of the scientific community on scientific literature. Specifically, such a fine-grained analysis may help research communities in improving their social media exchanges about the scientific articles to disseminate their scientific findings effectively and to further increase their societal impact.
Affiliation(s)
- Saeed-Ul Hassan
  - Department of Computer Science, Information Technology University, Pakistan
- Naif Radi Aljohani
  - Faculty of Computing and Information Technology, King Abdulaziz University, Saudi Arabia
- Usman Iqbal Tarar
  - Department of Computer Science, Information Technology University, Pakistan
- Iqra Safder
  - FAST School of Computing, FAST-NU Lahore, Pakistan
- Raheem Sarwar
  - Department of Operations, Technology, Events and Hospitality Management, Manchester Metropolitan University, United Kingdom
- Salem Alelyani
  - Center for Artificial Intelligence (CAI), King Khalid University, Saudi Arabia; College of Computer Science, King Khalid University, Saudi Arabia
|
3
|
Macedo JB, Ramos PMS, Maior CBS, Moura MJC, Lins ID, Vilela RFT. Identifying low-quality patterns in accident reports from textual data. International Journal of Occupational Safety and Ergonomics 2022:1-13. [PMID: 35980110 DOI: 10.1080/10803548.2022.2111847] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2022]
Abstract
Accident investigation reports provide useful knowledge that helps companies propose preventive and mitigative measures. However, the information in accident report databases is normally large, complex and error-ridden, with missing and/or redundant data. In this article, we propose text mining and natural language processing techniques to investigate low-quality accident reports. We adopted machine learning (ML) to detect and investigate inconsistencies in accident reports. The methodology was applied to 626 documents collected from an actual hydroelectric power company. The initial ML performances indicated data divergences and concerns related to the report structure. The accident database was then restructured into a more suitable form, confirming the supposition about the quality of the reports investigated. The proposed approach can be used as a diagnostic tool to improve the design of accident investigation reports so that they provide a more useful source of knowledge to support decisions in the safety context.
Affiliation(s)
- July B Macedo
  - CEERMA - Center for Risk Analysis, Reliability Engineering and Environmental Modeling, Federal University of Pernambuco, Brazil; Department of Production Engineering, Federal University of Pernambuco, Brazil
- Plinio M S Ramos
  - CEERMA - Center for Risk Analysis, Reliability Engineering and Environmental Modeling, Federal University of Pernambuco, Brazil; Department of Production Engineering, Federal University of Pernambuco, Brazil
- Caio B S Maior
  - CEERMA - Center for Risk Analysis, Reliability Engineering and Environmental Modeling, Federal University of Pernambuco, Brazil; Technology Center, Universidade Federal de Pernambuco, Brazil
- Márcio J C Moura
  - CEERMA - Center for Risk Analysis, Reliability Engineering and Environmental Modeling, Federal University of Pernambuco, Brazil; Department of Production Engineering, Federal University of Pernambuco, Brazil
- Isis D Lins
  - CEERMA - Center for Risk Analysis, Reliability Engineering and Environmental Modeling, Federal University of Pernambuco, Brazil; Department of Production Engineering, Federal University of Pernambuco, Brazil
|
4
|
Sotudeh H, Saber Z, Ghanbari Aloni F, Mirzabeigi M, Khunjush F. A longitudinal study of the evolution of opinions about open access and its main features: a twitter sentiment analysis. Scientometrics 2022. [DOI: 10.1007/s11192-022-04502-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
|
5
|
Das N, Sadhukhan B, Chatterjee T, Chakrabarti S. Effect of public sentiment on stock market movement prediction during the COVID-19 outbreak. Social Network Analysis and Mining 2022; 12:92. [PMID: 35911484 PMCID: PMC9325657 DOI: 10.1007/s13278-022-00919-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/01/2022] [Revised: 06/27/2022] [Accepted: 07/04/2022] [Indexed: 11/28/2022]
Abstract
Forecasting the stock market is one of the most difficult undertakings in the financial industry due to its complex, volatile, noisy, and nonparametric character. However, as computer science advances, an intelligent model can help investors and analysts minimize investment risk. Public opinion on social media and other online portals is an important factor in stock market predictions. The COVID-19 pandemic stimulates online activities since individuals are compelled to remain at home, bringing about a massive quantity of public opinion and emotion. This research focuses on stock market movement prediction with public sentiments using the long short-term memory network (LSTM) during the COVID-19 flare-up. Here, seven different sentiment analysis tools, VADER, logistic regression, Loughran–McDonald, Henry, TextBlob, Linear SVC, and Stanford, are used for sentiment analysis on web scraped data from four online sources: stock-related articles headlines, tweets, financial news from "Economic Times" and Facebook comments. Predictions are made utilizing both feeling scores and authentic stock information for every one of the 28 opinion measures processed. An accuracy of 98.11% is achieved by using linear SVC to calculate sentiment ratings from Facebook comments. Thereafter, the four estimated sentiment scores from each of the seven instruments are integrated with stock data in a step-by-step fashion to determine the overall influence on the stock market. When all four sentiment scores are paired with stock data, the forecast accuracy for five out of seven tools is at its most noteworthy, with linear SVC computed scores assisting stock data to arrive at its most elevated accuracy of 98.32%.
Affiliation(s)
- Nabanita Das
  - Department of Computer Science & Engineering, Techno International New Town, Kolkata, West Bengal, India
- Bikash Sadhukhan
  - Department of Computer Science & Engineering, Techno International New Town, Kolkata, West Bengal, India
- Tanusree Chatterjee
  - Department of Computer Science & Engineering, Techno International New Town, Kolkata, West Bengal, India
|
6
|
Verma S. Sentiment analysis of public services for smart society: Literature review and future research directions. Government Information Quarterly 2022. [DOI: 10.1016/j.giq.2022.101708] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/04/2022]
|
7
|
Sarwar R, Hassan SU. UrduAI: Writeprints for Urdu Authorship Identification. ACM T ASIAN LOW-RESO 2022. [DOI: 10.1145/3476467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Abstract
The authorship identification task aims at identifying the original author of an anonymous text sample from a set of candidate authors. It has several application domains such as digital text forensics and information retrieval. These application domains are not limited to a specific language. However, most authorship identification studies focus on English, and limited attention has been paid to Urdu. Moreover, existing Urdu authorship identification solutions lose accuracy as the number of training samples per candidate author decreases and as the number of candidate authors increases. Consequently, these solutions are inapplicable to real-world cases. In addition, due to the unavailability of reliable POS taggers or sentence segmenters, all existing authorship identification studies on Urdu text are limited to word n-gram features only. To overcome these limitations, we formulate a stylometric feature space that is not limited to word n-gram features. Based on this feature space, we use an authorship identification solution that transforms each text sample into a point set, retrieves candidate text samples, and relies on the nearest neighbors classifier to predict the original author of the anonymous text sample. To evaluate our solution, we create a significantly larger corpus than existing studies and conduct several experimental studies that show that our solution can overcome the limitations of existing studies and report an accuracy level of 94.03%, which is higher than all previous authorship identification works.
Affiliation(s)
- Raheem Sarwar
  - Research Group in Computational Linguistics, Research Institute of Information and Language Processing, University of Wolverhampton, Wolverhampton, Midlands, United Kingdom
- Saeed-Ul Hassan
  - Department of Computer Science, Information Technology University, Lahore, Punjab, Pakistan
|
8
|
Sarirete A. Sentiment analysis tracking of COVID-19 vaccine through tweets. Journal of Ambient Intelligence and Humanized Computing 2022; 14:1-9. [PMID: 35378971 PMCID: PMC8966855 DOI: 10.1007/s12652-022-03805-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/31/2021] [Accepted: 03/05/2022] [Indexed: 06/14/2023]
Abstract
Recent studies on the COVID-19 pandemic indicated an increase in the level of anxiety, stress, and depression among people of all ages. The World Health Organization (WHO) recently warned that even with the approval of vaccines by the Food and Drug Administration (FDA), population immunity is highly unlikely to be achieved this year. This paper aims to analyze people's sentiments during the pandemic by combining sentiment analysis and natural language processing algorithms to classify texts and extract the polarity, emotion, or consensus on COVID-19 vaccines based on tweets. The method used is based on the collection of tweets under the hashtag #COVIDVaccine while the nltk toolkit parses the texts, and the tf-idf algorithm generates the keywords. Both n-gram keywords and hashtags mentioned in the tweets are collected and counted. The results indicate that the sentiments are divided into positive and negative emotions, with the negative ones dominating.
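The tf-idf keyword step and the hashtag counting described in this abstract can be illustrated with a minimal sketch. It uses only the Python standard library rather than the nltk toolkit the abstract mentions. The tokenizer, the tf-idf variant (relative term frequency times the log of inverse document frequency) and the function names `tokenize`, `tfidf_keywords` and `hashtag_counts` are assumptions made for illustration, not the paper's implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and split it into word and hashtag tokens."""
    return re.findall(r"#?\w+", text.lower())


def tfidf_keywords(tweets, top_k=3):
    """Return the top_k (token, score) pairs for each tweet.

    Assumed weighting: tf = count / tweet length, idf = log(N / df).
    """
    docs = [tokenize(t) for t in tweets]
    n_docs = len(docs)
    # Document frequency: number of tweets that contain each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    results = []
    for doc in docs:
        tf = Counter(doc)
        scores = {
            tok: (count / len(doc)) * math.log(n_docs / df[tok])
            for tok, count in tf.items()
        }
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        results.append(ranked[:top_k])
    return results


def hashtag_counts(tweets):
    """Count how often each hashtag appears across all tweets."""
    return Counter(
        tok for t in tweets for tok in tokenize(t) if tok.startswith("#")
    )
```

Under this weighting, a token that occurs in every tweet (such as the collection hashtag itself) scores zero, so the ranked keywords are the terms that distinguish one tweet from the rest.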
Affiliation(s)
- Akila Sarirete
  - Computer Science Department, Effat College of Engineering, Effat University, Jeddah, Saudi Arabia
  - Energy and Technology Research Center, Effat University, Jeddah, Saudi Arabia
|
9
|
Zhou M, Mou H. Tracking public opinion about online education over COVID-19 in China. Educational Technology Research and Development: ETR & D 2022; 70:1083-1104. [PMID: 35221629 PMCID: PMC8862705 DOI: 10.1007/s11423-022-10080-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Accepted: 01/04/2022] [Indexed: 06/14/2023]
Abstract
Due to the novel coronavirus disease (COVID-19) outbreak in China, a large number of Chinese students resorted to online learning resources. The increasingly widespread use of online education enables the investigation of public opinion about this large-scale untraditional mode of learning during this critical period. Sina Weibo microblogs (the Chinese equivalent of Twitter) related to online education were collected in three distinct phases: from July 01, 2019 to January 09, 2020 (pre-pandemic); from January 10, 2020 to April 30, 2020 (amid-pandemic); and from May 01, 2020 to Nov 30, 2020 (post-pandemic). The aim was to obtain broad insight into how online learning was viewed by the public in the Chinese educational landscape. Public opinion during these three periods was analysed and compared. The findings facilitated a better understanding of how the Chinese public perceived this online learning mode as it became the dominant channel for teaching and learning during critical periods.
Affiliation(s)
- Mingming Zhou
  - Faculty of Education, University of Macau, Av. Padre Tomas Pereira, Taipa, Macau SAR, China
- Hao Mou
  - Guangzhou DataStory Information Technology Co., Ltd, Guangzhou, China
|
10
|
|
11
|
Luo J, Feliciani T, Reinhart M, Hartstein J, Das V, Alabi O, Shankar K. Analyzing sentiments in peer review reports: Evidence from two science funding agencies. Quantitative Science Studies 2021. [DOI: 10.1162/qss_a_00156] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/04/2022]
Abstract
Using a novel combination of methods and data sets from two national funding agency contexts, this study explores whether review sentiment can be used as a reliable proxy for understanding peer reviewer opinions. We measure reviewer opinions via their review sentiments on both specific review subjects and proposals’ overall funding worthiness with three different methods: manual content analysis and two dictionary-based sentiment analysis algorithms (TextBlob and VADER). The reliability of review sentiment to detect reviewer opinions is addressed by its correlation with review scores and proposals’ rankings and funding decisions. We find in our samples that review sentiments correlate with review scores or rankings positively, and the correlation is stronger for manually coded than for algorithmic results; manual and algorithmic results are overall correlated across different funding programs, review sections, languages, and agencies, but the correlations are not strong; and manually coded review sentiments can quite accurately predict whether proposals are funded, whereas the two algorithms predict funding success with moderate accuracy. The results suggest that manual analysis of review sentiments can provide a reliable proxy of grant reviewer opinions, whereas the two SA algorithms can be useful only in some specific situations.
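The reliability check described in this abstract, correlating review sentiment with review scores, can be sketched as follows. The abstract does not say which correlation coefficient was used, so the Pearson coefficient here is an assumption, as are the function name `pearson` and any input values a reader supplies. This is a minimal standard-library sketch, not the study's analysis code.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences.

    xs could hold per-review sentiment polarities (for example from a
    dictionary-based tool) and ys the matching review scores; both are
    hypothetical inputs supplied by the caller.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

A coefficient near 1 would indicate that more positive review text accompanies higher scores; the abstract reports positive but not strong correlations for the algorithmic sentiment measures.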
Affiliation(s)
- Junwen Luo
  - School of Information and Communication Studies, University College Dublin, Dublin, Ireland
- Thomas Feliciani
  - School of Sociology and Geary Institute of Public Policy, University College Dublin, Dublin, Ireland
- Martin Reinhart
  - Robert K. Merton Center for Science Studies, Humboldt-Universität zu Berlin, Berlin, Germany
- Judith Hartstein
  - German Centre for Higher Education Research and Science Studies (DZHW), Berlin, Germany
  - Faculty of Humanities and Social Sciences, Humboldt-Universität zu Berlin, Berlin, Germany
- Vineeth Das
  - School of Sociology and Geary Institute of Public Policy, University College Dublin, Dublin, Ireland
- Olalere Alabi
  - School of Sociology and Geary Institute of Public Policy, University College Dublin, Dublin, Ireland
- Kalpana Shankar
  - School of Information and Communication Studies, University College Dublin, Dublin, Ireland
|
12
|
Agarwal S, Chowdary CR. Combating hate speech using an adaptive ensemble learning model with a case study on COVID-19. Expert Systems with Applications 2021; 185:115632. [PMID: 36567759 PMCID: PMC9759712 DOI: 10.1016/j.eswa.2021.115632] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/27/2020] [Revised: 06/10/2021] [Accepted: 07/18/2021] [Indexed: 05/21/2023]
Abstract
Social media platforms generate an enormous amount of data every day. Millions of users engage themselves with the posts circulated on these platforms. Despite the social regulations and protocols imposed by these platforms, it is difficult to restrict some objectionable posts carrying hateful content. Automatic hate speech detection on social media platforms is an essential task that has not been solved efficiently despite multiple attempts by various researchers. It is a challenging task that involves identifying hateful content from social media posts. These posts may reveal hate outrageously, or they may be subjective to the user or a community. Relying on manual inspection delays the process, and the hateful content may remain available online for a long time. The current state-of-the-art methods for tackling hate speech perform well when tested on the same dataset but fail miserably on cross-datasets. Therefore, we propose an ensemble learning-based adaptive model for automatic hate speech detection, improving the cross-dataset generalization. The proposed expert model for hate speech detection works towards overcoming the strong user-bias present in the available annotated datasets. We conduct our experiments under various experimental setups and demonstrate the proposed model's efficacy on the latest issues such as COVID-19 and US presidential elections. In particular, the loss in performance observed under cross-dataset evaluation is the least among all the models. Also, while restricting the maximum number of tweets per user, we incur no drop in performance.
Affiliation(s)
- Shivang Agarwal
  - Department of Computer Science and Engineering, Indian Institute of Technology (BHU) Varanasi, 221005, India
- C Ravindranath Chowdary
  - Department of Computer Science and Engineering, Indian Institute of Technology (BHU) Varanasi, 221005, India
|
13
|
Ortega-Bastida J, Gallego AJ, Rico-Juan JR, Albarrán P. A multimodal approach for regional GDP prediction using social media activity and historical information. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2021.107693] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/20/2022]
|
14
|
|
15
|
Zhang L, Wang J. What affects publications’ popularity on Twitter? Scientometrics 2021. [DOI: 10.1007/s11192-021-04152-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
|
16
|
Basiri ME, Nemati S, Abdar M, Asadi S, Acharya UR. A novel fusion-based deep learning model for sentiment analysis of COVID-19 tweets. Knowl Based Syst 2021; 228:107242. [PMID: 36570870 PMCID: PMC9759659 DOI: 10.1016/j.knosys.2021.107242] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 06/07/2020] [Revised: 04/30/2021] [Accepted: 06/15/2021] [Indexed: 12/27/2022]
Abstract
Undoubtedly, coronavirus (COVID-19) has caused one of the biggest challenges of all time. The ongoing COVID-19 pandemic has caused more than 150 million infected cases and one million deaths globally as of May 5, 2021. Understanding the sentiment of people expressed in their social media comments can help in monitoring, controlling, and ultimately eradicating the disease. This is a sensitive matter as the threat of infectious disease significantly affects the way people think and behave in various ways. In this study, we proposed a novel method based on the fusion of four deep learning models and one classical supervised machine learning model for sentiment analysis of coronavirus-related tweets from eight countries. Also, we analyzed coronavirus-related searches using Google Trends to better understand the change in the sentiment pattern at different times and places. Our findings reveal that the coronavirus attracted the attention of people from different countries at different times in varying intensities. Also, the sentiment in their tweets is correlated with the news and events that occurred in their countries, including the number of newly infected cases, recoveries and deaths. Moreover, common sentiment patterns can be observed in various countries during the spread of the virus. We believe that different social media platforms have a great impact on raising people's awareness about the importance of this disease as well as promoting preventive measures among people in the community.
Affiliation(s)
- Mohammad Ehsan Basiri
  - Department of Computer Engineering, Shahrekord University, Shahrekord, Iran (corresponding author)
- Shahla Nemati
  - Department of Computer Engineering, Shahrekord University, Shahrekord, Iran
- Moloud Abdar
  - Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Australia
- Somayeh Asadi
  - Department of Architectural Engineering, Pennsylvania State University, 104 Engineering Unit A, University Park, PA, 16802, USA
- U. Rajendra Acharya
  - Department of Electronics and Computer Engineering, Ngee Ann Polytechnic, Clementi, Singapore; Department Bioinformatics and Medical Engineering, Asia University, Taiwan; International Research Organization for Advanced Science and Technology (IROAST), Kumamoto University, Kumamoto, Japan
|
17
|
Nwankwo TV, Odiachi RA, Anene IA. Black articles matter: exploring relative deprivation and implicit bias in library and information science research publications of Africa and other continents. Library Hi Tech 2021. [DOI: 10.1108/lht-05-2021-0164] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022]
Abstract
Purpose
The purpose of this paper is to explore relative deprivation and implicit bias in library and information science research publications of Africa and other continents.
Design/methodology/approach
The research design used for this study is descriptive survey research. Specifically, the study adopts both web content analysis and a survey to collect data. The content analysis covers all the continents of the world: Africa, Asia, Eastern Europe, Latin America, Middle East, Northern America, Pacific Region and Western Europe; using the Webometrics World Ranking of Universities and the SCImago/Scopus Journal Ranking. Library and information science was used as the search and control parameter. The scopes covered by the research are: 1. Ascertaining the visible publishing and assessment standards of top library and information science (LIS) journals, which were evaluated using the study by Kleinert and Wager (2010).
Findings
Among other findings, a policy of editors making fair and unbiased decisions is seen in only 33% of the journals, which is very poor. All the structural disparities, such as presence ranking, impact ranking, excellence ranking, etc., mainly favoured Europe and the Americas. As much as rejection is getting to these respondents, research generally is also suffering by missing out on some untapped knowledge and ideas from these deprived populations. Many authors are losing faith in their capabilities and are now afraid of venturing into tedious research exercises because their work will most likely be rejected either way.
Research limitations/implications
It is an established fact that social media helps research gain impact and attracts international collaborations. In support, studies such as Hassan et al. (2019) reported that tweet mentions of articles with positive sentiment lead to more visibility and citations. They claim that cited articles in either positive or neutral tweets have a more significant impact than those not cited at all or cited in negative tweets. In addition, Hassan et al. (2020) equally highlighted tweet coupling as a social media methodology useful for clustering scientific publications. Despite the fact that social media have these influences on research and publications visibility and presence, the context of the present research did not cover this scope of study. The study focused mainly on sources from Scopus as well as results from responses. Further studies can be carried out on this area.
Originality/value
Research studies linking “Black Articles Matter” to relative deprivation and implicit bias in research publications, especially in the library and information discipline, are very rare. Also, the scope of approach of the study is quite different and interesting.
|
18
|
Zhang T, Lin H, Xu B, Yang L, Wang J, Duan X. Adversarial neural network with sentiment-aware attention for detecting adverse drug reactions. J Biomed Inform 2021; 123:103896. [PMID: 34487887 DOI: 10.1016/j.jbi.2021.103896] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/28/2021] [Revised: 08/22/2021] [Accepted: 08/23/2021] [Indexed: 11/24/2022]
Abstract
Adverse drug reaction (ADR) detection is an important issue in drug safety. ADRs are health threats caused by medication. Identifying ADRs in a timely manner can reduce harm to patients and can also assist doctors in the rational use of drugs. Many studies have investigated potential ADRs based on social media due to the openness and timeliness of this resource; however, they have ignored the fine-grained emotional expression in social media text. In addition, the benchmark datasets from social media are usually small, which can result in the problem of over-fitting. In this paper, we propose the Adversarial Neural Network with Sentiment-aware Attention (ANNSA) model, which enhances the sentimental element in social media and improves the performance of neural networks via data augmentation. Specifically, a sentiment-aware attention mechanism is proposed to extract the word-level sentiment features associated with sentiment words and learn task-related information by optimizing a task-specific loss. For low-resource datasets, we use an adversarial training approach to generate perturbations of the word embeddings via an implicit regularization technique. ANNSA was tested on three social media ADR detection datasets, namely, Twitter, TwiMed (Twitter) and CADEC. The experimental results indicated the ability to achieve F1 values of 48.84%, 64.18% and 83.06%, respectively, comparable to the best results reported for state-of-the-art methods. Our study demonstrates that sentiment words are highly correlated with ADRs and that word-level sentiment features can assist in detecting ADRs from social media datasets.
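The F1 values quoted in this abstract combine precision and recall into one number. As a reminder of how that metric is computed, here is a minimal sketch using the standard definition; the function name `f1_score` and the count-based interface are choices made for illustration, and the abstract does not state how its F1 values were averaged across classes.

```python
def f1_score(true_pos, false_pos, false_neg):
    """F1 for one class: the harmonic mean of precision and recall.

    true_pos: posts correctly flagged as mentioning an ADR
    false_pos: posts flagged that do not mention an ADR
    false_neg: ADR posts the model missed
    """
    flagged = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / flagged if flagged else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a model cannot reach a high F1 by trading away either precision or recall, which matters on small, imbalanced social media datasets like those the abstract describes.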
Affiliation(s)
- Tongxuan Zhang
  - Tianjin Normal University, Tianjin, China; Dalian University of Technology, Dalian, China
- Hongfei Lin
  - Dalian University of Technology, Dalian, China
- Bo Xu
  - Dalian University of Technology, Dalian, China
- Liang Yang
  - Dalian University of Technology, Dalian, China
- Jian Wang
  - Dalian University of Technology, Dalian, China
|
19
|
Grissette H, Nfaoui EH. Affective Concept-Based Encoding of Patient Narratives via Sentic Computing and Neural Networks. Cognit Comput 2021; 14:274-299. [PMID: 34422122 PMCID: PMC8371039 DOI: 10.1007/s12559-021-09903-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2020] [Accepted: 06/23/2021] [Indexed: 11/30/2022]
Abstract
The automatic generation of features without human intervention is the most critical task for biomedical sentiment analysis. Regarding the high dynamicity of shared patient narrative data, the lack of formal medical language sentiment dictionaries prevents retrieval of the appropriate sentiment, which is unapproachable and can be prone to annotator bias. We propose a novel affective biomedical concept-based encoding via sentic computing and neural networks. The main contributions include four aspects. First, a biomedical embedding, in which a medical entity is defined, normalized, and synthesized from a text, is built using online patient narratives after being combined with label propagation from a widely used comprehensive biomedical vocabulary. Second, considering the dependence on biomedical definitions, drug reaction sample selection based on general matching is suggested. These feature settings are then used to build and recognize affective semantics and sentics based on an extreme learning machine. Finally, a semisupervised LSTM-BiLSTM model for biomedical sentiment analysis is constructed. There was a massive influx of patient self-reports related to the COVID-19 pandemic. A study was conducted in this direction, and we tested the validity, medical language familiarity, and transferability of our approach by analyzing millions of COVID-19 tweets. Comparisons to affective lexicons also indicate that integrating extreme learning machine cognitive capabilities has advantages over biomedical sentiment analysis. By considering sentics vectors on top of the formed embeddings, our semisupervised LSTM-BiLSTM achieved an accuracy of 87.5%. The evaluations of unsupervised learning approximated the results of the previous model when dealing with a serious loss of biomedical data. 
In this paper, we demonstrate the effectiveness of integrating deep-learning-based cognitive capabilities for both enhancing distributed biomedical definitions and inferring sentiment compositions from many patient self-reports on social networks. The relevant encoding of affective information conveyed regarding medication subjects clearly reveals defined roles and expectations that can have a positive impact on public health.
Affiliation(s)
- Hanane Grissette
  - LISAC Laboratory, Faculty of Sciences Dhar EL Mahraz, Sidi Mohamed Ben Abdellah University, Fez, Morocco
- El Habib Nfaoui
  - LISAC Laboratory, Faculty of Sciences Dhar EL Mahraz, Sidi Mohamed Ben Abdellah University, Fez, Morocco
|
20
|
Shokouhyar S, Dehkhodaei A, Amiri B. A mixed-method approach for modelling customer-centric mobile phone reverse logistics: application of social media data. JOURNAL OF MODELLING IN MANAGEMENT 2021. [DOI: 10.1108/jm2-07-2020-0191] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/17/2022]
Abstract
Purpose
Recently, reverse logistics (RL) has become more prominent due to growing environmental concerns, social responsibility, competitive advantage and high efficiency by customers because of the expansion of product selection and shorter product life cycles. Effective implementation of RL also brings some direct advantages, the most important of which is winning customer satisfaction, which is vital to a firm’s success. Attention to customer feedback in supply chain and logistics processes has therefore increased recently, and manufacturers have decided to transform their RL into customer-centric RL. Hence, this paper aims to identify the features of a mobile phone which affect consumer purchasing behaviour and to analyse the interrelationship among them to develop a framework for customer-centric RL. These features are studied based on website analysis of several mobile phone manufacturers. The special focus of this paper is on social media data (Twitter) in an attempt to help the decision-making process in RL through a big data analysis approach.
Design/methodology/approach
To achieve this objective, a portfolio of mobile phone features that affect consumers’ mobile phone purchasing decisions was taken from website analysis of several mobile phone manufacturers. Then, interrelationships between the identified features were established by using big data supplemented with interpretive structural modelling (ISM). In addition, cross-impact matrix multiplication applied to classification (MICMAC) analysis was carried out to graphically represent these features based on their driving power and dependence.
Findings
The ISM shows that the chip (F5) is the most significant feature affecting customers’ buying behaviour; mobile phone manufacturers should therefore recognize that it is to be addressed first.
Originality/value
The focus of this paper is on social media data (Twitter) so that experts can understand the interaction between mobile phone features that affect consumer’s decisions on mobile phone purchasing by using the results.
|
21
|
The Gig Economy: Current Issues, the Debate, and the New Avenues of Research. SUSTAINABILITY 2021. [DOI: 10.3390/su13095023] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 01/09/2023]
Abstract
In the context of the debate on the platform economy, on the one hand, and the gig economy, on the other, this paper delineates the conceptual boundaries of both concepts to query the gig economy research included in the Web of Science database. The initial search, with a cutoff date of February 2020, targeting “gig economy” returned a sample of 378 papers dealing with the topic. The subsequent analysis, employing the science mapping method and related software (SciMAT), made it possible to query the body of research dealing with the gig economy in detail. The value added by this paper is fourfold. First, the broad literature on the gig economy is mapped and the nascent synergies relating both to research opportunities and economic implications are identified and highlighted. Second, the findings reveal that while research on the gig economy proliferates, the distinction between “platform” and “gig” economy frequently remains blurred in the analysis. This paper elaborates on this issue. Third, it is highlighted that the discussion on the gig economy is largely dispersed and a clearer research agenda is needed to streamline the discussion to improve its exploratory and explanatory potential. This paper suggests ways of navigating this issue. Fourth, by mapping the existing research on the gig economy and highlighting its caveats, the way toward a comprehensive research agenda in the field is highlighted.
|
22
|
Alamoodi AH, Zaidan BB, Zaidan AA, Albahri OS, Mohammed KI, Malik RQ, Almahdi EM, Chyad MA, Tareq Z, Albahri AS, Hameed H, Alaa M. Sentiment analysis and its applications in fighting COVID-19 and infectious diseases: A systematic review. EXPERT SYSTEMS WITH APPLICATIONS 2021; 167:114155. [PMID: 33139966 PMCID: PMC7591875 DOI: 10.1016/j.eswa.2020.114155] [Citation(s) in RCA: 86] [Impact Index Per Article: 28.7] [Received: 04/21/2020] [Revised: 10/23/2020] [Accepted: 10/23/2020] [Indexed: 05/05/2023]
Abstract
The COVID-19 pandemic caused by the novel coronavirus SARS-CoV-2 occurred unexpectedly in China in December 2019. Tens of millions of confirmed cases and more than hundreds of thousands of confirmed deaths are reported worldwide according to the World Health Organisation. News about the virus is spreading all over social media websites. Consequently, these social media outlets are experiencing and presenting different views, opinions and emotions during various outbreak-related incidents. For computer scientists and researchers, big data are valuable assets for understanding people's sentiments regarding current events, especially those related to the pandemic. Therefore, analysing these sentiments will yield remarkable findings. To the best of our knowledge, previous related studies have focused on one kind of infectious disease. No previous study has examined multiple diseases via sentiment analysis. Accordingly, this research aimed to review and analyse articles about the occurrence of different types of infectious diseases, such as epidemics, pandemics, viruses or outbreaks, during the last 10 years, understand the application of sentiment analysis and obtain the most important literature findings. Articles on related topics were systematically searched in five major databases, namely, ScienceDirect, PubMed, Web of Science, IEEE Xplore and Scopus, from 1 January 2010 to 30 June 2020. These indices were considered sufficiently extensive and reliable to cover our scope of the literature. Articles were selected based on our inclusion and exclusion criteria for the systematic review, with a total of n = 28 articles selected. All these articles were formed into a coherent taxonomy to describe the corresponding current standpoints in the literature in accordance with four main categories: lexicon-based models, machine learning-based models, hybrid-based models and individuals. 
The obtained articles were categorised into motivations related to disease mitigation, data analysis and challenges faced by researchers with respect to data, social media platforms and community. Other aspects, such as the protocol being followed by the systematic review and demographic statistics of the literature distribution, were included in the review. Interesting patterns were observed in the literature, and the identified articles were grouped accordingly. This study emphasised the current standpoint and opportunities for research in this area and promoted additional efforts towards the understanding of this research field.
Affiliation(s)
- A H Alamoodi
  - Department of Computing, Sultan Idris University of Education (UPSI), Tanjong Malim, Malaysia
- B B Zaidan
  - Department of Computing, Sultan Idris University of Education (UPSI), Tanjong Malim, Malaysia
  - Future Technology Research Center, National Yunlin University of Science and Technology, 123 University Road, Section 3, Douliou, Yunlin 64002, Taiwan, ROC
- A A Zaidan
  - Department of Computing, Sultan Idris University of Education (UPSI), Tanjong Malim, Malaysia
- O S Albahri
  - Department of Computing, Sultan Idris University of Education (UPSI), Tanjong Malim, Malaysia
- K I Mohammed
  - Department of Computing, Sultan Idris University of Education (UPSI), Tanjong Malim, Malaysia
- R Q Malik
  - Department of Engineering Technology, Universiti Tun Hussein Onn (UTHM), Batu Pahat, Malaysia
- E M Almahdi
  - Department of Computing, Sultan Idris University of Education (UPSI), Tanjong Malim, Malaysia
- M A Chyad
  - Department of Computing, Sultan Idris University of Education (UPSI), Tanjong Malim, Malaysia
- Z Tareq
  - Department of Computer Science, Computer Science and Mathematics College, Tikrit University, Tikrit 34001, Iraq
- A S Albahri
  - Informatics Institute for Postgraduate Studies (IIPS), Iraqi Commission for Computers and Informatics (ICCI), Baghdad, Iraq
- Hamsa Hameed
  - Faculty of Human Development, Sultan Idris University of Education (UPSI), Tanjung Malim, Malaysia
- Musaab Alaa
  - Faculty of Languages and Communication, Sultan Idris University of Education (UPSI), Tanjong Malim, Malaysia
|
23
|
Sarwar R, Zia A, Nawaz R, Fayoumi A, Aljohani NR, Hassan SU. Webometrics: evolution of social media presence of universities. Scientometrics 2021. [DOI: 10.1007/s11192-020-03804-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 10/22/2022]
|
24
|
Schelkle B, Galland Q. Microbiome Research: Open Communication Today, Microbiome Applications in the Future. Microorganisms 2020; 8:E1960. [PMID: 33322055 PMCID: PMC7763060 DOI: 10.3390/microorganisms8121960] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/27/2020] [Revised: 11/27/2020] [Accepted: 12/08/2020] [Indexed: 11/16/2022] Open
Abstract
Microbiome research has recently gained centre-stage in both basic science and translational applications, yet researchers often feel that public communication about its potential overpromises. This manuscript aims to share a perspective on how scientists can engage in more open, ethical and transparent communication using an ongoing research project on food systems microbiomes as a case study. Concrete examples of strategically planned communication efforts are outlined, which aim to inspire and empower other researchers. Finally, we conclude with a discussion on the benefits of open and transparent communication from early-on in innovation pathways, mainly increasing trust in scientific processes and thus paving the way to achieving societal milestones such as the UN Sustainable Development Goals and the EU Green Deal.
Affiliation(s)
- Bettina Schelkle
  - European Food Information Council, Rue des Deux Eglises 14, 1000 Brussels, Belgium
- Quentin Galland
  - Hague Corporate Affairs, Rue Belliard 40, 1040 Brussels, Belgium
|
25
|
Drongstrup D, Malik S, Aljohani NR, Alelyani S, Safder I, Hassan SU. Can social media usage of scientific literature predict journal indices of AJG, SNIP and JCR? An altmetric study of economics. Scientometrics 2020. [DOI: 10.1007/s11192-020-03613-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 10/23/2022]
|
26
|
Towards the Discovery of Influencers to Follow in Micro-Blogs (Twitter) by Detecting Topics in Posted Messages (Tweets). APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10165715] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 01/08/2023]
Abstract
Micro-blogs, such as Twitter, have become important tools to share opinions and information among users. Messages concerning any topic are posted daily. A message posted by a given user reaches all the users that decided to follow her/him. Some users post many messages, because they aim at being recognized as influencers, typically on specific topics. How can a user discover influencers concerned with her/his interest? Micro-blog apps and web sites lack a functionality to recommend influencers to users on the basis of the content of posted messages. In this paper, we envision such a scenario and we identify the problem that constitutes the basic brick for developing a recommender of (possibly influencer) users: training a classification model by exploiting messages labeled with topical classes, so that this model can be used to classify unlabeled messages and let the hidden topic they talk about emerge. Specifically, the paper reports the investigation activity we performed to demonstrate the suitability of our idea. To perform the investigation, we developed an investigation framework that exploits various patterns for extracting features from within messages (labeled with topical classes) in conjunction with the most commonly used classifiers for text classification problems. By means of the investigation framework, we were able to perform a large pool of experiments that allowed us to evaluate all the combinations of feature patterns with classifiers. By means of a cost-benefit function called “Suitability”, which combines accuracy with execution time, we were able to demonstrate that a technique for discovering topics from within messages suitable for the application context is available.
|
27
|
Sarwar R, Rutherford AT, Hassan SU, Rakthanmanon T, Nutanong S. Native Language Identification of Fluent and Advanced Non-Native Writers. ACM T ASIAN LOW-RESO 2020. [DOI: 10.1145/3383202] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 10/24/2022]
Abstract
Native Language Identification (NLI) aims at identifying the native languages of authors by analyzing their text samples written in a non-native language. Most existing studies investigate this task for educational applications such as second language acquisition and require the learner corpora. This article performs NLI in a challenging context of the user-generated-content (UGC) where authors are fluent and advanced non-native speakers of a second language. Existing NLI studies with UGC (i) rely on the content-specific/social-network features and may not be generalizable to other domains and datasets, (ii) are unable to capture the variations of the language-usage-patterns within a text sample, and (iii) are not associated with any outlier handling mechanism. Moreover, since there is a sizable number of people who have acquired non-English second languages due to the economic and immigration policies, there is a need to gauge the applicability of NLI with UGC to other languages. Unlike existing solutions, we define a topic-independent feature space, which makes our solution generalizable to other domains and datasets. Based on our feature space, we present a solution that mitigates the effect of outliers in the data and helps capture the variations of the language-usage-patterns within a text sample. Specifically, we represent each text sample as a point set and identify the top-k stylistically similar text samples (SSTs) from the corpus. We then apply the probabilistic k nearest neighbors’ classifier on the identified top-k SSTs to predict the native languages of the authors. To conduct experiments, we create three new corpora where each corpus is written in a different language, namely, English, French, and German. Our experimental studies show that our solution outperforms competitive methods and reports more than 80% accuracy across languages.
Affiliation(s)
- Raheem Sarwar
  - School of Information Science and Technology, Vidyasirimedhi Institute of Science and Technology, Wangchan, Rayong, Thailand
- Attapol T. Rutherford
  - Department of Linguistics, Faculty of Arts, Chulalongkorn University, Pathumwan, Bangkok, Thailand
- Saeed-Ul Hassan
  - Department of Computer Science, Information Technology University, Lahore, Punjab, Pakistan
- Thanawin Rakthanmanon
  - Department of Computer Engineering, Kasetsart University, Thailand and School of Information Science and Technology, Vidyasirimedhi Institute of Science and Technology, Wangchan, Rayong, Thailand
- Sarana Nutanong
  - School of Information Science and Technology, Vidyasirimedhi Institute of Science and Technology, Wangchan, Rayong, Thailand
|