1
Joshi RC, Srivastava P, Mishra R, Burget R, Dutta MK. Biomarker profiling and integrating heterogeneous models for enhanced multi-grade breast cancer prognostication. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 255:108349. [PMID: 39096573 DOI: 10.1016/j.cmpb.2024.108349] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2024] [Revised: 07/01/2024] [Accepted: 07/22/2024] [Indexed: 08/05/2024]
Abstract
BACKGROUND Breast cancer remains a leading cause of female mortality worldwide, exacerbated by limited awareness, inadequate screening resources, and limited treatment options. Accurate and early diagnosis is crucial for improving survival rates and effective treatment. OBJECTIVES This study aims to develop an innovative artificial intelligence (AI) based model for predicting breast cancer and its various histopathological grades by integrating multiple biomarkers and subject age, thereby enhancing diagnostic accuracy and prognostication. METHODS A novel ensemble-based machine learning (ML) framework has been introduced that integrates three distinct biomarkers, namely beta-human chorionic gonadotropin (β-hCG), Programmed Cell Death Ligand 1 (PD-L1), and alpha-fetoprotein (AFP), alongside subject age. Hyperparameter optimization was performed using the Particle Swarm Optimization (PSO) algorithm, and minority oversampling techniques were employed to mitigate overfitting. The model's performance was validated through rigorous five-fold cross-validation. RESULTS The proposed model demonstrated superior performance, achieving 97.93% accuracy and a 98.06% F1-score on meticulously labeled test data across diverse age groups. Comparative analysis showed that the model outperforms state-of-the-art approaches, highlighting its robustness and generalizability. CONCLUSION By providing a comprehensive analysis of multiple biomarkers and effectively predicting tumor grades, this study offers a significant advancement in breast cancer screening, particularly in regions with limited medical resources. The proposed framework has the potential to reduce breast cancer mortality rates and improve early intervention and personalized treatment strategies.
Affiliation(s)
- Rakesh Chandra Joshi
- Amity Centre for Artificial Intelligence, Amity University, Noida, Uttar Pradesh, India; Centre for Advanced Studies, Dr. A.P.J. Abdul Kalam Technical University, Lucknow, Uttar Pradesh, India
- Pallavi Srivastava
- Department of Biotechnology, Noida Institute of Engineering & Technology, Greater Noida, Uttar Pradesh, India
- Rashmi Mishra
- Department of Biotechnology, Noida Institute of Engineering & Technology, Greater Noida, Uttar Pradesh, India
- Radim Burget
- Department of Telecommunications, Faculty of Electrical Engineering and Communication, Brno University of Technology, Brno, Czech Republic
- Malay Kishore Dutta
- Amity Centre for Artificial Intelligence, Amity University, Noida, Uttar Pradesh, India
2
Premkumar M, Sinha G, Ramasamy MD, Sahu S, Subramanyam CB, Sowmya R, Abualigah L, Derebew B. Augmented weighted K-means grey wolf optimizer: An enhanced metaheuristic algorithm for data clustering problems. Sci Rep 2024; 14:5434. [PMID: 38443569 PMCID: PMC10914809 DOI: 10.1038/s41598-024-55619-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Accepted: 02/26/2024] [Indexed: 03/07/2024] Open
Abstract
This study presents the K-means clustering-based grey wolf optimizer, a new algorithm intended to improve the optimization capabilities of the conventional grey wolf optimizer for data clustering, the process of grouping similar items within a dataset into non-overlapping groups. Grey wolf hunting behaviour served as the model for the grey wolf optimizer; however, the algorithm frequently lacks the exploration and exploitation capabilities that are essential for efficient data clustering. This work mainly focuses on enhancing the grey wolf optimizer using a new weight factor and K-means algorithm concepts in order to increase diversity and avoid premature convergence. Using a partitional clustering-inspired fitness function, the K-means clustering-based grey wolf optimizer was extensively evaluated on ten numerical functions and multiple real-world datasets with varying levels of complexity and dimensionality. The methodology incorporates the K-means algorithm concept to refine initial solutions and adds a weight factor to increase the diversity of solutions during the optimization phase. The results show that the K-means clustering-based grey wolf optimizer performs much better than the standard grey wolf optimizer in discovering optimal clustering solutions, indicating a higher capacity for effective exploration and exploitation of the solution space. The study found that the K-means clustering-based grey wolf optimizer was able to produce high-quality cluster centres in fewer iterations, demonstrating its efficacy and efficiency on various datasets. Finally, the study demonstrates the robustness and dependability of the K-means clustering-based grey wolf optimizer in resolving data clustering issues, which represents a significant advancement over conventional techniques.
In addition to addressing the shortcomings of the initial algorithm, the incorporation of K-means and the innovative weight factor into the grey wolf optimizer establishes a new standard for further study in metaheuristic clustering algorithms. The performance of the K-means clustering-based grey wolf optimizer is around 34% better than the original grey wolf optimizer algorithm for both numerical test problems and data clustering problems.
Affiliation(s)
- Manoharan Premkumar
- Department of Electrical & Electronics Engineering, Dayananda Sagar College of Engineering, Kumaraswamy Layout, Bengaluru, Karnataka, 560078, India
- Garima Sinha
- Department of Computer Science and Engineering, Jain University, Ramanagaram, Bengaluru, Karnataka, India
- Manjula Devi Ramasamy
- Department of Computer Science and Engineering, KPR Institute of Engineering and Technology, Coimbatore, Tamil Nadu, India
- Santhoshini Sahu
- Department of Computer Science & Engineering, GMR Institute of Technology, Rajam, Srikakulam, Andhra Pradesh, India
- Ravichandran Sowmya
- Department of Electrical and Electronics Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, India
- Laith Abualigah
- Computer Science Department, Al al-Bayt University, Mafraq, 25113, Jordan
- Artificial Intelligence and Sensing Technologies (AIST) Research Center, University of Tabuk, 71491, Tabuk, Saudi Arabia
- Hourani Center for Applied Scientific Research, Al-Ahliyya Amman University, Amman, 19328, Jordan
- MEU Research Unit, Middle East University, Amman, 11831, Jordan
- Department of Electrical and Computer Engineering, Lebanese American University, Byblos, 13-5053, Lebanon
- School of Engineering and Technology, Sunway University Malaysia, 27500, Petaling Jaya, Malaysia
- College of Engineering, Yuan Ze University, Taoyuan, Taiwan
- Department of Statistics, College of Natural and Computational Science, Mizan-Tepi University, Tepi Bushira, Ethiopia
- Bizuwork Derebew
- Applied science research center, Applied science private university, Amman, 11931, Jordan
3
Wang Y, Zhang P. Prediction of histone deacetylase inhibition by triazole compounds based on artificial intelligence. Front Pharmacol 2023; 14:1260349. [PMID: 38035010 PMCID: PMC10684768 DOI: 10.3389/fphar.2023.1260349] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/17/2023] [Accepted: 10/30/2023] [Indexed: 12/02/2023] Open
Abstract
A quantitative structure-activity relationship (QSAR) study was conducted to predict the anti-colon cancer activity and HDAC inhibition of triazole-containing compounds. The four descriptors with the most obvious effect on the inhibition of histone deacetylase (HDAC) were selected from 579 descriptors. Four QSAR models were constructed using the heuristic method (HM), random forest (RF), radial basis kernel function support vector machine (RBF-SVM) and support vector machine optimized by particle swarm optimization (PSO-SVM). Furthermore, the robustness of the four QSAR models was verified by the K-fold cross-validation method, described by Q². In addition, the R² values of the four models are all greater than 0.8, which indicates that the four selected descriptors are reasonable. Among the four models, the model based on the PSO-SVM method has the best prediction ability and robustness, with R² of 0.954, root mean squared error (RMSE) of 0.019 and Q² of 0.916 for the training set and R² of 0.965, RMSE of 0.017 and Q² of 0.907 for the test set. In this study, four key descriptors were discovered, which will help to screen effective new anti-colon cancer drugs in the future.
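The three figures of merit named in this abstract (R², RMSE and a K-fold cross-validated Q²) can be illustrated with a minimal Python sketch. It rests on stated assumptions: Q² is taken here as R² computed on held-out predictions, which is one common convention and may differ from the one the authors used; the one-descriptor least-squares model (`fit_line`, `predict_line`) and the toy data are hypothetical stand-ins, not the paper's descriptors or models.

```python
import math

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def fit_line(xs, ys):
    """Hypothetical stand-in model: ordinary least squares on one descriptor."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept

def k_fold_q_squared(xs, ys, k, fit, predict):
    """Assumed Q^2 convention: R^2 over predictions made for held-out folds."""
    n = len(xs)
    held_out = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in test:
            held_out[i] = predict(model, xs[i])
    return r_squared(ys, held_out)

# Toy data (not from the paper): a perfectly linear response.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
model = fit_line(xs, ys)
fitted = [predict_line(model, x) for x in xs]
print(r_squared(ys, fitted), rmse(ys, fitted))
print(k_fold_q_squared(xs, ys, 5, fit_line, predict_line))
```

A Q² close to the training R², as reported for the PSO-SVM model, suggests that the fit is not driven by a few training compounds.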
Affiliation(s)
- Peijian Zhang
- College of Computer Science and Technology, Qingdao University, Qingdao, Shandong Province, China
4
Methods in Medicine CAM. Retracted: A Systematic Literature Review on Particle Swarm Optimization Techniques for Medical Diseases Detection. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2023; 2023:9825640. [PMID: 37564750 PMCID: PMC10412168 DOI: 10.1155/2023/9825640] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2023] [Accepted: 08/01/2023] [Indexed: 08/12/2023]
Abstract
[This retracts the article DOI: 10.1155/2021/5990999.].
5
Shan G, Wu X, Li G, Xing C, Zhang S, Fu Y. Thermodynamic Multi-Field Coupling Optimization of Microsystem Based on Artificial Intelligence. MICROMACHINES 2023; 14:411. [PMID: 36838112 PMCID: PMC9963334 DOI: 10.3390/mi14020411] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Revised: 02/05/2023] [Accepted: 02/07/2023] [Indexed: 06/18/2023]
Abstract
An efficient multi-objective optimization method of temperature and stress for a microsystem based on particle swarm optimization (PSO) was established. It maps the relationship between through-silicon via (TSV) structural design parameters and performance objectives in the microsystem, and efficiently optimizes temperature, stress and thermal expansion deformation. The relationship between the design and performance parameters is obtained by a finite element method (FEM) simulation model. A neural network is built and trained to learn this mapping. Then, the design parameters are iteratively optimized using the PSO algorithm, and the FEM results are used to verify the efficiency and reliability of the optimization method. When the optimization targets for peak temperature, bump temperature, TSV temperature, maximum stress and maximum thermal deformation are set to 100 °C, 55 °C, 35 °C, 180 MPa and 12 μm, the optimization results are as follows: the peak temperature is 97.90 °C, the bump temperature is 56.01 °C, the TSV temperature is 31.52 °C, the maximum stress is 247.4 MPa and the maximum expansion deformation is 11.14 μm. The corresponding TSV structure design parameters are as follows: the radius of TSV is 10.28 μm, the pitch is 65 μm and the thickness of SiO2 is 0.83 μm. The errors between the optimization results and the respective targets are 2.1%, 1.8%, 9.9%, 37.4% and 7.2%. The PSO method has been verified by regression analysis, and its temperature and deformation results differ from those of the FEM method by no more than 3%. The stress error has been analyzed, and the reliability of the developed method has been verified.
While ensuring the accuracy of the results, the proposed optimization method reduces the time consumption of a single simulation from 2 h to 70 s, saves a lot of time and human resources, greatly improves the efficiency of the optimization design of microsystems, and has great significance for the development of microsystems.
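The five percentage errors quoted in this abstract follow from the stated targets and results if each error is taken as |result − target| / target. The short Python check below reproduces them under that assumption (the abstract does not state the formula), together with the speed-up implied by going from 2 h to 70 s per simulation. The dictionary keys are illustrative labels only.

```python
# Targets and optimized values as quoted in the abstract.
targets = {
    "peak temperature": 100.0,    # °C
    "bump temperature": 55.0,     # °C
    "TSV temperature": 35.0,      # °C
    "maximum stress": 180.0,      # MPa
    "maximum deformation": 12.0,  # μm
}
results = {
    "peak temperature": 97.90,
    "bump temperature": 56.01,
    "TSV temperature": 31.52,
    "maximum stress": 247.4,
    "maximum deformation": 11.14,
}

def relative_error_percent(result, target):
    """Assumed definition: absolute deviation as a percentage of the target."""
    return abs(result - target) / abs(target) * 100.0

errors = {name: relative_error_percent(results[name], targets[name]) for name in targets}
for name, err in errors.items():
    print(f"{name}: {err:.1f}%")

# Speed-up implied by reducing one simulation from 2 h to 70 s.
speedup = (2 * 3600) / 70
print(f"speed-up: about {speedup:.0f}x")
```

The stress objective stands out: at 37.4% it is the only quantity far from its target, which is consistent with the abstract treating the stress error separately.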
Affiliation(s)
- Guangbao Shan
- School of Microelectronics, Xidian University, Xi’an 710071, China
- Xudong Wu
- School of Microelectronics, Xidian University, Xi’an 710071, China
- Guoliang Li
- School of Microelectronics, Xidian University, Xi’an 710071, China
- Chaoyang Xing
- Beijing Institute of Aerospace Control Devices, Beijing 100039, China
- Shengchang Zhang
- School of Microelectronics, Xidian University, Xi’an 710071, China
- Yu Fu
- China Academy of Aerospace Standardization and Product Assurance, Beijing 100071, China
6
Wang ZJ, Yang Q, Zhang YH, Chen SH, Wang YG. Superiority combination learning distributed particle swarm optimization for large-scale optimization. Appl Soft Comput 2023. [DOI: 10.1016/j.asoc.2023.110101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2023]
7
Asiri Y. Computing Drug-Drug Similarity from Patient-Centric Data. Bioengineering (Basel) 2023; 10:182. [PMID: 36829676 PMCID: PMC9952733 DOI: 10.3390/bioengineering10020182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2023] [Revised: 01/22/2023] [Accepted: 01/29/2023] [Indexed: 02/04/2023] Open
Abstract
In modern biology and medicine, drug-drug similarity is a major task with various applications in pharmaceutical drug development. Various direct and indirect sources of evidence obtained from drug-centric data such as side effects, drug interactions, biological targets, and chemical structures are used in the current methods to measure the level of drug-drug similarity. This paper proposes a computational method to measure drug-drug similarity using a novel source of evidence that is obtained from patient-centric data. More specifically, patients' narration of their thoughts, opinions, and experience with drugs in social media are explored as a potential source to compute drug-drug similarity. Online healthcare communities were used to extract a dataset of patients' reviews on anti-epileptic drugs. The collected dataset is preprocessed through Natural Language Processing (NLP) techniques and four text similarity methods are applied to measure the similarities among them. The obtained similarities are then used to generate drug-drug similarity-based ranking matrices which are analyzed through Pearson correlation, to answer questions related to the overall drug-drug similarity and the accuracy of the four similarity measures. To evaluate the obtained drug-drug similarities, they are compared with the corresponding ground-truth similarities obtained from DrugSimDB, a well-known drug-drug similarity tool that is based on drug-centric data. The results provide evidence on the feasibility of patient-centric data from social media as a novel source for computing drug-drug similarity.
Affiliation(s)
- Yousef Asiri
- Department of Computer Science, Najran University, Najran 61441, Saudi Arabia
8
Pan S, Gupta TK, Raza K. BatTS: a hybrid method for optimizing deep feedforward neural network. PeerJ Comput Sci 2023; 9:e1194. [PMID: 37346535 PMCID: PMC10280266 DOI: 10.7717/peerj-cs.1194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2022] [Accepted: 11/30/2022] [Indexed: 06/23/2023]
Abstract
Deep feedforward neural networks (DFNNs) have attained remarkable success in almost every computational task. However, the selection of a DFNN architecture is still based on handcrafted or trial-and-error methods, so designing the architecture is an essential factor for DFNNs. Unfortunately, creating a DFNN architecture is a laborious and time-consuming task when aiming for state-of-the-art work. This article proposes a new hybrid methodology (BatTS) to optimize the DFNN architecture based on its performance. BatTS is the result of integrating the Bat algorithm, Tabu search (TS), and the gradient descent with momentum (GDM) backpropagation training algorithm. The main features of BatTS are the following: a dynamic process of finding new architectures based on Bat, the ability to escape from local minima, and fast convergence in evaluating new architectures based on the Tabu search feature. The performance of BatTS is compared with the Tabu search based approach and random trials. An empirical evaluation on four different benchmark datasets shows that the proposed hybrid methodology has improved performance over existing techniques, which are mainly random trials.
Affiliation(s)
- Sichen Pan
- School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, Guangdong Province, China
- Tarun Kumar Gupta
- Department of Computer Science, Jamia Millia Islamia, New Delhi, Delhi, India
- Khalid Raza
- Department of Computer Science, Jamia Millia Islamia, New Delhi, Delhi, India
9
Zhang J, Liang R, Lau N, Lei Q, Yip J. A Systematic Analysis of 3D Deformation of Aging Breasts Based on Artificial Neural Networks. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2022; 20:468. [PMID: 36612790 PMCID: PMC9819929 DOI: 10.3390/ijerph20010468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 12/21/2022] [Accepted: 12/22/2022] [Indexed: 06/17/2023]
Abstract
The measurement and prediction of breast skin deformation are key research directions in health-related research areas, such as cosmetic and reconstructive surgery and sports biomechanics. However, few studies have provided a systematic analysis on the deformations of aging breasts. Thus, this study has developed a model order reduction approach to predict the real-time strain of the breast skin of seniors during movement. Twenty-two women who are on average 62 years old participated in motion capture experiments, in which eight body variables were first extracted by using the gray relational method. Then, backpropagation artificial neural networks were built to predict the strain of the breast skin. After optimization, the R-value for the neural network model reached 0.99, which is within acceptable accuracy. The computer-aided system of this study is validated as a robust simulation approach for conducting biomechanical analyses and predicting breast deformation.
Affiliation(s)
- Jun Zhang
- School of Fashion and Textiles, The Hong Kong Polytechnic University, Hung Hom, Hong Kong, China
- Ruixin Liang
- Laboratory for Artificial Intelligence in Design, Hong Kong Science Park, New Territories, Hong Kong, China
- Newman Lau
- School of Design, The Hong Kong Polytechnic University, Hung Hom, Hong Kong, China
- Qiwen Lei
- School of Fashion and Textiles, The Hong Kong Polytechnic University, Hung Hom, Hong Kong, China
- Joanne Yip
- School of Fashion and Textiles, The Hong Kong Polytechnic University, Hung Hom, Hong Kong, China
- Laboratory for Artificial Intelligence in Design, Hong Kong Science Park, New Territories, Hong Kong, China
10
Talebi F, Nazemi A, Ataabadi AA. Mean-AVaR in credibilistic portfolio management via an artificial neural network scheme. J EXP THEOR ARTIF IN 2022. [DOI: 10.1080/0952813x.2022.2153271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2022]
Affiliation(s)
- Fatemeh Talebi
- Faculty of Mathematical sciences, Shahrood University of Technology, Shahrood, Iran
- Alireza Nazemi
- Faculty of Mathematical sciences, Shahrood University of Technology, Shahrood, Iran
- Abdolmajid Abdolbaghi Ataabadi
- Department of Management, Faculty of Industrial Engineering and Management, Shahrood University of Technology, Shahrood, Iran
11
Rekha KS, Sabu MK. A cooperative deep learning model for stock market prediction using deep autoencoder and sentiment analysis. PeerJ Comput Sci 2022; 8:e1158. [PMID: 36532805 PMCID: PMC9748829 DOI: 10.7717/peerj-cs.1158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2022] [Accepted: 10/25/2022] [Indexed: 06/17/2023]
Abstract
Stock market prediction is a challenging and complex problem that has received the attention of researchers due to the high returns resulting from an improved prediction. Even though machine learning models are popular in this domain, the dynamic and volatile nature of stock markets limits the accuracy of stock prediction. Studies show that incorporating news sentiment in stock market predictions enhances performance compared to models using stock features alone. There is a need to develop an architecture that facilitates noise removal from stock data, captures market sentiments, and ensures prediction to a reasonable degree of accuracy. The proposed cooperative deep-learning architecture comprises a deep autoencoder, lexicon-based software for sentiment analysis of news headlines, and LSTM/GRU layers for prediction. The autoencoder is used to denoise the historical stock data, and the denoised data is passed to the deep learning model along with news sentiments. The stock data is concatenated with the sentiment score and fed to the LSTM/GRU model for output prediction. The model's performance is evaluated using the standard measures used in the literature. The results show that the combined model using a deep autoencoder with news sentiments performs better than the standalone LSTM/GRU models. The performance of our model also compares favorably with state-of-the-art models in the literature.
Affiliation(s)
- KS Rekha
- Department of Computer Applications, Cochin University of Science and Technology, Kochi, Kerala, India
- Department of Computer Science and Engineering, College of Engineering Kidangoor, Kottayam, Kerala, India
- MK Sabu
- Department of Computer Applications, Cochin University of Science and Technology, Kochi, Kerala, India
12
Yuan C, Li X. Fitting of TC model according to key parameters affecting Parkinson's state based on improved particle swarm optimization algorithm. Sci Rep 2022; 12:13938. [PMID: 35977977 PMCID: PMC9385711 DOI: 10.1038/s41598-022-18267-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2021] [Accepted: 08/08/2022] [Indexed: 11/10/2022] Open
Abstract
Biophysical models contain a large number of parameters, while the spiking characteristics of neurons are related to a few key parameters. For thalamic neurons, relay reliability is an important characteristic that affects Parkinson's state. This paper proposes a method to fit key parameters of the model based on the spiking characteristics of neurons, and improves the traditional particle swarm optimization algorithm. Specifically, a nonlinear concave function and a Logistic chaotic mapping are combined to adjust the inertia weight of particles, so that particles avoid falling into a local optimum during the search or converging prematurely. In this paper, three parameters that play an important role in Parkinson's state of the thalamic cell model are selected and fitted by the improved particle swarm optimization algorithm. The neuron model reconstructed from the fitted parameters predicts the spiking trajectories well, which verifies the effectiveness of the fitting method. A comparison of the fitting results with other particle swarm optimization algorithms shows that the proposed algorithm can better avoid local optima and converge to the optimal values quickly.
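The inertia-weight idea described in this abstract can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the abstract does not give the concave function or the way it is combined with the logistic map, so `concave_inertia`, the scaling `0.5 + 0.5 * z`, the coefficients `c1 = c2 = 1.5`, and the sphere test objective are all hypothetical choices, not the authors' formulation or their thalamic cell model.

```python
import random

def logistic_map(z):
    """Logistic chaotic map at r = 4; keeps z in [0, 1]."""
    return 4.0 * z * (1.0 - z)

def concave_inertia(t, t_max, w_start=0.9, w_end=0.4):
    """Assumed concave decay of the inertia weight from w_start to w_end."""
    return w_end + (w_start - w_end) * (1.0 - (t / t_max) ** 2)

def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

def pso(objective, dim=3, n_particles=20, t_max=100, bound=5.0, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]
    z = 0.37            # arbitrary chaotic seed (assumption)
    c1 = c2 = 1.5       # acceleration coefficients (assumption)
    for t in range(t_max):
        z = logistic_map(z)
        # Hypothetical combination: concave schedule perturbed by the chaotic value.
        w = concave_inertia(t, t_max) * (0.5 + 0.5 * z)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = max(-bound, min(bound, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history

best, best_val, history = pso(sphere)
print(best_val)
```

In a parameter-fitting setting like the one described, the objective would instead measure the mismatch between simulated and recorded spiking characteristics; the swarm update itself would stay the same.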
Affiliation(s)
- Chunhua Yuan
- School of Automation and Electrical Engineering, Shenyang Ligong University, Shenyang, 110159, China
- Xiangyu Li
- School of Automation and Electrical Engineering, Shenyang Ligong University, Shenyang, 110159, China
13
Geometry-V-Sub: An Efficient Graph Attention Network Struct Based Model for Node Classification. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12147246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Abstract
With the development of deep learning and graph deep learning, network structures are becoming more and more complex, and the number of parameters in the network model and the computing and storage resources required are increasing. Lightweight design and optimization of the network structure reduces the required computing and storage resources, lowers the demands the network model places on the computing environment, widens its scope of application, and reduces the energy consumed in computing, which benefits environmental protection. The contribution of this paper is Geometry-V-Sub, a graph learning structure based on spatial geometry that greatly reduces the parameter requirements while losing only a little accuracy. The number of parameters is only 13.05–16.26% of the baseline model, and the accuracy on Cora, Citeseer and PubMed reaches up to 80.4%, 68% and 81.8%, respectively. When the number of parameters is only 12.01% of the baseline model, the F1 score reaches up to 98.4.
14
An Iterative Backbone Algorithm for Service Network Design Problems. Processes (Basel) 2022. [DOI: 10.3390/pr10071373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022] Open
Abstract
Service network design problems arise at airlines, trucking companies, and railroads wherever there is a need to determine cost-minimizing routes and schedules, given resource availability and service constraints. In recent years, the application of consolidation-based service network design in the express service has attracted lots of academic attention due to the rapid growth of the express industry. This paper studies the consolidation-based service network design problem, which jointly determines the commodity flow, vehicle dispatching, and fleet sizing. We propose a mixed-integer optimization model to address the problem and design an efficient iterative backbone algorithm to solve large-scale real-world problems. The numerical results of large-scale instances confirmed that the solution obtained by our proposed algorithm is better than that of the primal model, and the running time taken is less than half that of the general solution approach. The computational study confirmed the effectiveness and efficiency of the proposed algorithm.
15
Effective Realization of Multi-Objective Elitist Teaching–Learning Based Optimization Technique for the Micro-Siting of Wind Turbines. SUSTAINABILITY 2022. [DOI: 10.3390/su14148458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
In this paper, the meta-heuristic multi-objective elitist teaching–learning based optimization technique is applied to the discrete wind farm layout optimization problem. Wind farm layout optimization addresses the optimal siting of the wind turbines within the wind farm to meet economical, profitable, and technical objectives. The presented methodology is applied to a multi-objective optimization problem with targets such as minimizing cost, maximizing power output, and reducing the number of turbines. These targets are investigated with case studies of multi-objective optimization problems in three wind scenarios (Scenario-I: fixed wind direction and constant speed, Scenario-II: variable wind direction and constant speed, and Scenario-III: variable wind direction and variable speed) for the optimal micro-siting of wind turbines in a given land area that maximizes the power production while minimizing the total cost. To check the effectiveness of the algorithm, firstly, the results obtained for the three different scenarios have been compared with past studies available in the literature. Secondly, the number of turbines has also been optimized by using teaching–learning based optimization. It has been observed that the proposed algorithm finds the optimal layouts along with the optimal number of turbines with minimum fitness evaluations. Finally, the concept of elitism has been introduced in the teaching–learning based optimization algorithm. It is proposed that if elitist teaching–learning based optimization with an elite size of 15% is used, computational expense can be significantly reduced. It can be concluded that the results obtained by the proposed algorithm are more accurate and advantageous than others.
16
A Predictive Checkpoint Technique for Iterative Phase of Container Migration. SUSTAINABILITY 2022. [DOI: 10.3390/su14116538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Abstract
Cloud computing is a cost-effective method of delivering numerous services in Industry 4.0. The demand for dynamic cloud services is rising day by day and, because of this, data transit across the network is extensive. Virtualization is a significant component and the cloud servers might be physical or virtual. Containerized services are essential for reducing data transmission, cost, and time, among other things. Containers are lightweight virtual environments that share the host operating system’s kernel. The majority of businesses are transitioning from virtual machines to containers. The major factor affecting the performance is the amount of data transfer over the network. It has a direct impact on the migration time, downtime and cost. In this article, we propose a predictive iterative-dump approach using long short-term memory (LSTM) to anticipate which memory pages will be moved, by limiting data transmission during the iterative phase. In each loop, the pages are shortlisted to be migrated to the destination host based on predictive analysis of memory alterations. Dirty pages will be predicted and discarded using a prediction technique based on the alteration rate. The results show that the suggested technique surpasses existing alternatives in overall migration time and amount of data transmitted. There was a 49.42% decrease in migration time and a 31.0446% reduction in the amount of data transferred during the iterative phase.
17
A Particle Swarm Optimization Backtracking Technique Inspired by Science-Fiction Time Travel. AI 2022. [DOI: 10.3390/ai3020024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022] Open
Abstract
Artificial intelligence techniques, such as particle swarm optimization, are used to solve problems throughout society. Optimization, in particular, seeks to identify the best possible decision within a search space. Problematically, particle swarm optimization will sometimes have particles that become trapped inside local minima, preventing them from identifying a global optimal solution. As a solution to this issue, this paper proposes a science-fiction inspired enhancement of particle swarm optimization where an impactful iteration is identified and the algorithm is rerun from this point, with a change made to the swarm. The proposed technique is tested using multiple variations on several different functions representing optimization problems and several standard test functions used to test various particle swarm optimization techniques.
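For orientation, the baseline that this abstract builds on can be sketched as below. This is a minimal, generic particle swarm optimizer and not the backtracking enhancement the paper proposes; the function name `pso`, the parameter values (`w`, `c1`, `c2`, swarm size, iteration count), and the sphere test function are illustrative assumptions, not taken from the article.

```python
import random

def pso(objective, dim, bounds, n_particles=20, n_iters=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain global-best PSO (hypothetical helper; minimizes `objective`)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def sphere(x):
    """Standard convex test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

best, best_val = pso(sphere, dim=2, bounds=(-5.0, 5.0))
```

On a multimodal function, particles in this baseline can settle into a local minimum, which is the failure mode the abstract describes.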
|
18
|
Strengthening intrusion detection system for adversarial attacks: improved handling of imbalance classification problem. COMPLEX INTELL SYST 2022. [DOI: 10.1007/s40747-022-00739-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
Abstract
Most defence mechanisms such as a network-based intrusion detection system (NIDS) are often sub-optimal for the detection of an unseen malicious pattern. In response, a number of studies attempt to empower a machine-learning-based NIDS to improve the ability to recognize adversarial attacks. Along this line of research, the present work focuses on non-payload connections at the TCP stack level, which is generalized and applicable to different network applications. As a complement to the recently published investigation that searches for the most informative feature space for classifying obfuscated connections, the problem of class imbalance is examined herein. In particular, a multiple-clustering-based undersampling framework is proposed to determine the set of cluster centroids that best represent the majority class, whose size is reduced to be on par with that of the minority. Initially, a pool of centroids is created using the concept of ensemble clustering that aims to obtain a collection of accurate and diverse clusterings. The final set of representatives is then selected from this pool. Three different objective functions are formed for this optimization-driven process, thus leading to three variants: FF-Majority, FF-Minority and FF-Overall. Based on the thorough evaluation of a published dataset, four classification models and different settings, these new methods often exhibit better predictive performance than their baseline, the single-clustering undersampling counterpart and state-of-the-art techniques. Parameter analysis and implication for analyzing an extreme case are also provided as a guideline for future applications.
|
19
|
Saxena A, Rubens M, Ramamoorthy V, Zhang Z, Ahmed MA, McGranaghan P, Das S, Veledar E. A Brief Overview of Adaptive Designs for Phase I Cancer Trials. Cancers (Basel) 2022; 14:cancers14061566. [PMID: 35326715 PMCID: PMC8946506 DOI: 10.3390/cancers14061566] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/17/2022] [Revised: 03/16/2022] [Accepted: 03/17/2022] [Indexed: 12/18/2022]
Abstract
Simple Summary Phase I cancer trials are important in new drug development for testing the safety and optimal dosage of cancer drugs, which are usually toxic. Understanding the biostatistical methodologies of these designs is important for developing phase I studies that are safe for the participants and that use optimal dosages for better outcomes. Currently there are several phase I designs that are being refined and modified for better outcomes, and newer designs are being continuously developed. In this review article, we described several important phase I study designs to provide a brief overview of existing methods. Our review could be helpful to members of the research community who intend to have a better yet concise summary of existing methods. Abstract Phase I studies are used to estimate the dose-toxicity profile of the drugs and to select appropriate doses for successive studies. However, the literature on statistical methods used for phase I studies is extensive. The objective of this review is to provide a concise summary of existing and emerging techniques for selecting dosages that are appropriate for phase I cancer trials. Many advanced statistical studies have proposed novel and robust methods for adaptive designs that have shown significant advantages over conventional dose finding methods. An increasing number of phase I cancer trials use adaptive designs, particularly during the early phases of the study. In this review, we described nonparametric and algorithm-based designs such as traditional 3 + 3, accelerated titration, Bayesian algorithm-based design, up-and-down design, and isotonic design. In addition, we also described parametric model-based designs such as continual reassessment method, escalation with overdose control, and Bayesian decision theoretic and optimal design. Ongoing studies have been continuously focusing on improving and refining the existing models as well as developing newer methods. 
This study would help readers to assimilate core concepts and compare different phase I statistical methods under one banner. Nevertheless, other evolving methods require future reviews.
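Of the algorithm-based designs named in this abstract, the traditional 3 + 3 rule is simple enough to state as code. The sketch below encodes one commonly described formulation of the rule (cohorts of three, expansion to six after a single dose-limiting toxicity); the function name `next_action` and the returned labels are hypothetical, and actual trial protocols may define variants that differ from this.

```python
def next_action(n_treated, n_dlt):
    """Decide what to do at the current dose level under a textbook 3 + 3 rule.

    n_treated: patients treated at this dose level so far (3 or 6).
    n_dlt:     how many of them had a dose-limiting toxicity (DLT).
    """
    if not 0 <= n_dlt <= n_treated:
        raise ValueError("n_dlt must be between 0 and n_treated")
    if n_treated == 3:
        if n_dlt == 0:
            return "escalate"   # no toxicity seen: move to the next dose
        if n_dlt == 1:
            return "expand"     # ambiguous: treat 3 more at the same dose
        return "stop"           # 2+ of 3: dose exceeds the tolerated level
    if n_treated == 6:
        # After expansion, at most 1 DLT in 6 patients permits escalation.
        return "escalate" if n_dlt <= 1 else "stop"
    raise ValueError("3 + 3 cohorts contain 3 or 6 patients per dose level")

# Example walk-through of three dose levels.
decisions = [next_action(3, 0), next_action(3, 1), next_action(6, 2)]
```

When the rule returns "stop", the maximum tolerated dose is conventionally taken to be the dose level below the current one. The model-based designs in the abstract (for example, the continual reassessment method) replace this fixed rule with a fitted dose-toxicity model.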
Affiliation(s)
- Anshul Saxena
- Center for Advanced Analytics, Baptist Health South Florida, Miami, FL 33176, USA; (V.R.); (Z.Z.); (M.A.A.); (E.V.)
- Robert Stempel College of Public Health & Social Work, Florida International University, Miami, FL 33199, USA
- Correspondence: (A.S.); (P.M.)
| | - Muni Rubens
- Miami Cancer Institute, Baptist Health South Florida, Miami, FL 33176, USA;
| | - Venkataraghavan Ramamoorthy
- Center for Advanced Analytics, Baptist Health South Florida, Miami, FL 33176, USA; (V.R.); (Z.Z.); (M.A.A.); (E.V.)
| | - Zhenwei Zhang
- Center for Advanced Analytics, Baptist Health South Florida, Miami, FL 33176, USA; (V.R.); (Z.Z.); (M.A.A.); (E.V.)
| | - Md Ashfaq Ahmed
- Center for Advanced Analytics, Baptist Health South Florida, Miami, FL 33176, USA; (V.R.); (Z.Z.); (M.A.A.); (E.V.)
| | - Peter McGranaghan
- Miami Cancer Institute, Baptist Health South Florida, Miami, FL 33176, USA;
- Department of Internal Medicine and Cardiology, Charité—Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, 10117 Berlin, Germany
- Correspondence: (A.S.); (P.M.)
| | - Sankalp Das
- Wellness and Employee Health, Baptist Health South Florida, Miami, FL 33176, USA;
| | - Emir Veledar
- Center for Advanced Analytics, Baptist Health South Florida, Miami, FL 33176, USA; (V.R.); (Z.Z.); (M.A.A.); (E.V.)
- Robert Stempel College of Public Health & Social Work, Florida International University, Miami, FL 33199, USA
| |
|
20
|
Nekooei A, Safari S. Compression of Deep Neural Networks based on quantized tensor decomposition to implement on reconfigurable hardware platforms. Neural Netw 2022; 150:350-363. [PMID: 35344706 DOI: 10.1016/j.neunet.2022.02.024] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/24/2021] [Revised: 11/28/2021] [Accepted: 02/24/2022] [Indexed: 11/24/2022]
Abstract
Deep Neural Networks (DNNs) have been widely and successfully employed in various artificial intelligence and machine learning applications (e.g., image processing and natural language processing). As DNNs become deeper and include more filters per layer, they incur high computational costs and large memory consumption to store their large number of parameters. Moreover, present processing platforms (e.g., CPU, GPU, and FPGA) do not have enough internal memory, and hence external memory storage is needed. Deploying DNNs in mobile applications is therefore difficult, considering the limited storage space, computation power, energy supply, and real-time processing requirements. In this work, network parameters were compressed using a method based on tensor decomposition, thereby reducing access to external memory. This compression method decomposes the network layers' weight tensor into a limited number of principal vectors such that (i) almost all the initial parameters can be retrieved, (ii) the network structure does not change, and (iii) the network quality after reproducing the parameters is almost the same as that of the original network in terms of detection accuracy. To optimize the realization of this method on FPGA, the tensor decomposition algorithm was modified without affecting its convergence, and the reproduction of network parameters on FPGA was straightforward. The proposed algorithm reduced the parameters of ResNet50, VGG16, and VGG19 networks trained with Cifar10 and Cifar100 by almost 10 times.
|
21
|
A Divide-and-Conquer Bat Algorithm with Direction of Mean Best Position for Optimization of Cutting Parameters in CNC Turnings. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:4719266. [PMID: 35251149 PMCID: PMC8890851 DOI: 10.1155/2022/4719266] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/02/2021] [Revised: 01/04/2022] [Accepted: 01/13/2022] [Indexed: 11/18/2022]
Abstract
Optimization of machining parameters is an important problem in the modern manufacturing world due to production efficiency and economics. This problem is well known to be complex and is regarded as a strongly nondeterministic polynomial (NP)-hard problem. To reduce the production cost of work-pieces in computer numerical control (CNC) machining, a novel optimization algorithm based on a combination of the bat algorithm and a divide-and-conquer strategy is proposed. First, the basic bat algorithm (BA) is modified with the aim to avoid finding the local optimal solution. In addition, a Gaussian quantum bat algorithm with direction of mean best position is developed. Second, in order to reduce the complexity of the optimization problem, the whole optimization problem is divided into several subproblems by using a divide-and-conquer strategy according to the characteristic of multipass turning operations. Finally, under a large number of machining constraints, the cutting parameters of the two stages of roughing and finishing are simultaneously optimized. Simulation results show that the proposed algorithm can find better combinations of the machining parameters than other algorithms proposed previously to further reduce the production cost. In addition, the outcome of our work presents a novel way to solve the complex optimization problem of machining parameters with a combination of traditional mathematical methods and swarm intelligence algorithms.
|
22
|
Zhang Y, Mu X, Liu X, Wang X, Zhang X, Li K, Wu T, Zhao D, Dong C. Applying the quantum approximate optimization algorithm to the minimum vertex cover problem. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.108554] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
|
23
|
Kogilavani SV, Prabhu J, Sandhiya R, Kumar MS, Subramaniam U, Karthick A, Muhibbullah M, Imam SBS. COVID-19 Detection Based on Lung Ct Scan Using Deep Learning Techniques. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2022; 2022:7672196. [PMID: 35116074 PMCID: PMC8805449 DOI: 10.1155/2022/7672196] [Citation(s) in RCA: 37] [Impact Index Per Article: 18.5] [Received: 10/12/2021] [Accepted: 01/07/2022] [Indexed: 12/17/2022]
Abstract
SARS-CoV-2 is a novel virus responsible for the COVID-19 pandemic that has emerged in recent years. In 2019, the city of Wuhan reported the first-ever incidence of COVID-19. People infected with COVID-19 have symptoms related to pneumonia, and the virus affects the body's respiratory organs, making breathing difficult. A real-time reverse transcriptase-polymerase chain reaction (RT-PCR) kit is used to diagnose the disease. Due to a shortage of kits, suspected patients cannot be treated promptly, resulting in disease spread. To develop an alternative, radiologists looked at the changes in radiological imaging, such as CT scans, which produce comprehensive, high-quality pictures of the body. The suspected patient's computed tomography (CT) scan is used to distinguish between a healthy individual and a COVID-19 patient using deep learning algorithms. Many deep learning methods have been proposed for COVID-19. The proposed work utilizes CNN architectures including VGG16, DenseNet121, MobileNet, NASNet, Xception, and EfficientNet. The dataset contains 3873 CT scan images in total, labeled "COVID" and "Non-COVID." The dataset is divided into train, test, and validation sets. The accuracies obtained are 97.68% for VGG16, 97.53% for DenseNet121, 96.38% for MobileNet, 89.51% for NASNet, 92.47% for Xception, and 80.19% for EfficientNet. The results show that the VGG16 architecture gives better accuracy than the other architectures.
Affiliation(s)
- S. V. Kogilavani
- Department of Computer Science and Engineering, Kongu Engineering College, Perundurai, Erode 638060, Tamil Nadu, India
| | - J. Prabhu
- School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India
| | - R. Sandhiya
- Department of Computer Science and Engineering, Kongu Engineering College, Perundurai, Erode 638060, Tamil Nadu, India
| | - M. Sandeep Kumar
- School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India
| | - UmaShankar Subramaniam
- Renewable Energy Lab, College of Engineering, Prince Sultan University, Riyadh, Saudi Arabia 11586
- Department of Energy and Environmental Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Saveetha University, Saveetha Nagar, Thandalam, Chennai-602105, Tamilnadu, India
| | - Alagar Karthick
- Renewable Energy Lab, Department of Electrical and Electronics Engineering, KPR Institute of Engineering and Technology, Coimbatore, 641407 Tamilnadu, India
| | - M. Muhibbullah
- Department of Electrical and Electronic Engineering, Bangladesh University, Dhaka 1207, Bangladesh
| | - Sharmila Banu Sheik Imam
- College of Computer Science & Information Technology (CCSIT), King Faisal University, Alahsa, Saudi Arabia 31982
| |
|
24
|
A Fine-Tuned BERT-Based Transfer Learning Approach for Text Classification. JOURNAL OF HEALTHCARE ENGINEERING 2022; 2022:3498123. [PMID: 35013691 PMCID: PMC8742153 DOI: 10.1155/2022/3498123] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 09/14/2021] [Revised: 11/25/2021] [Accepted: 12/03/2021] [Indexed: 01/10/2023]
Abstract
The text classification problem has been thoroughly studied in information retrieval and data mining. It is beneficial in multiple tasks including medical diagnosis, health and care departments, targeted marketing, the entertainment industry, and group filtering processes. Recent innovations in both data mining and natural language processing (NLP) have drawn the attention of researchers from all over the world to developing automated systems for text classification. NLP allows categorizing documents containing different texts. A huge amount of data is generated on social media sites by their users. Three datasets have been used for experimental purposes: the COVID-19 fake news dataset, the COVID-19 English tweet dataset, and the extremist-non-extremist dataset, which contain news blogs, posts, and tweets related to coronavirus and hate speech. Transfer learning approaches have not previously been tested on the COVID-19 fake news and extremist-non-extremist datasets. Therefore, the proposed work applied transfer learning classification models to both of these datasets to check their performance. Models are trained and evaluated on accuracy, precision, recall, and F1-score. Heat maps are also generated for every model. In the end, future directions are proposed.
|
25
|
Yahya AA, Asiri Y, Alyami I. Social Media Analytics for Pharmacovigilance of Antiepileptic Drugs. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2022; 2022:8965280. [PMID: 35027943 PMCID: PMC8752219 DOI: 10.1155/2022/8965280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2021] [Accepted: 12/04/2021] [Indexed: 11/17/2022]
Abstract
Epilepsy is a common neurological disorder worldwide and antiepileptic drug (AED) therapy is the cornerstone of its treatment. It has a laudable aim of achieving seizure freedom with minimal, if any, adverse drug reactions (ADRs). Too often, AED treatment is a long-lasting journey, in which ADRs have a crucial role in its administration. Therefore, from a pharmacovigilance perspective, detecting the ADRs of AEDs is a task of utmost importance. Typically, this task is accomplished by analyzing relevant data from spontaneous reporting systems. Despite their wide adoption for pharmacovigilance activities, the passiveness and high underreporting ratio associated with spontaneous reporting systems have encouraged the consideration of other data sources such as electronic health databases and pharmaceutical databases. Social media is the most recent alternative data source with many promising potentials to overcome the shortcomings of traditional data sources. Although in the literature some attempts have investigated the validity and utility of social media for ADR detection of different groups of drugs, none of them was dedicated to the ADRs of AEDs. Hence, this paper presents a novel investigation of the validity and utility of social media as an alternative data source for the detection of AED ADRs. To this end, a dataset of consumer reviews from two online health communities has been collected. The dataset is preprocessed; the unigram, bigram, and trigram are generated; and the ADRs of each AED are extracted with the aid of consumer health vocabulary and ADR lexicon. Three widely used measures, namely, proportional reporting ratio, reporting odds ratio, and information component, are used to measure the association between each ADR and AED. The resulting list of signaled ADRs for each AED is validated against a widely used ADR database, called Side Effect Resource, in terms of the precision of ADR detection. 
The validation results indicate the validity of online health community data for the detection of AED ADRs. Furthermore, the lists of signaled AED ADRs are analyzed to answer questions related to the common ADRs of AEDs and the similarities between AEDs in terms of their signaled ADRs. The consistency of the drawn answers with the existing pharmaceutical knowledge suggests the utility of the data from online health communities for AED-related knowledge discovery tasks.
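The three association measures named in this abstract are usually computed from a 2 x 2 contingency table of reports. The sketch below uses the standard textbook definitions of the proportional reporting ratio and reporting odds ratio, plus the basic (unshrunk) form of the information component; the cell labels `a`, `b`, `c`, `d`, the function name, and the example counts are illustrative assumptions, and the article may use a shrinkage-corrected information component that differs from this.

```python
import math

def disproportionality(a, b, c, d):
    """Signal measures from a 2 x 2 report table (hypothetical helper).

    a: reports mentioning the drug of interest AND the reaction
    b: reports mentioning the drug of interest without the reaction
    c: reports mentioning other drugs AND the reaction
    d: reports mentioning other drugs without the reaction
    """
    n = a + b + c + d
    # PRR: reaction rate for this drug relative to all other drugs.
    prr = (a / (a + b)) / (c / (c + d))
    # ROR: odds of the reaction for this drug relative to other drugs.
    ror = (a * d) / (b * c)
    # IC: log2 of observed over expected co-occurrence under independence.
    ic = math.log2(a * n / ((a + b) * (a + c)))
    return prr, ror, ic

# Made-up counts, for illustration only.
prr, ror, ic = disproportionality(a=20, b=80, c=100, d=800)
```

With these made-up counts, the drug's reaction rate is 20/100 against 100/900 for other drugs, so PRR is 1.8, ROR is 2.0, and IC is positive. Values above 1 for PRR and ROR, or above 0 for IC, indicate that the reaction is reported more often with the drug than expected.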
Affiliation(s)
- Anwar Ali Yahya
- Department of Computer Science, Najran University, Najran, Saudi Arabia
| | - Yousef Asiri
- Department of Computer Science, Najran University, Najran, Saudi Arabia
| | - Ibrahim Alyami
- Department of Computer Science, Najran University, Najran, Saudi Arabia
| |
|
26
|
Resource Efficient VM placement in Cloud Environment using Improved Particle Swarm Optimization. INTERNATIONAL JOURNAL OF APPLIED METAHEURISTIC COMPUTING 2022. [DOI: 10.4018/ijamc.298312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
Fundamentally, a strategy that uses resources effectively improves the energy efficiency of the system. The growing interest of users in cloud computing has led to increased power consumption, making network operation costly. Frequent user requests for computing resources can lead to instability in the load of the computing system. To balance the load across hosts, virtual machines need to be migrated away from overloaded and underloaded hosts, which is an important factor in energy consumption. The proposed Particle Swarm Optimization based Resource Aware VM Placement (RAPSO_VMP) scheme aims to place the migrated virtual machines. RAPSO_VMP takes into consideration multiple resources such as CPU, storage, and memory while trying to optimize the overall resource utilization of the system. According to the simulation analysis, the proposed RAPSO_VMP scheme improves energy consumption by 5.51% and reduces the number of migrations by 9.12% and the number of host shutdowns by 22.74%.
|
27
|
Bangyal WH, Qasim R, Rehman NU, Ahmad Z, Dar H, Rukhsar L, Aman Z, Ahmad J. Detection of Fake News Text Classification on COVID-19 Using Deep Learning Approaches. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2021; 2021:5514220. [PMID: 34819990 PMCID: PMC8608495 DOI: 10.1155/2021/5514220] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 02/27/2021] [Accepted: 10/15/2021] [Indexed: 01/10/2023]
Abstract
A vast amount of data is generated every second through microblogs, content sharing via social media sites, and social networking. Twitter is a popular microblog where people voice their opinions about daily issues. Analyzing these opinions is the primary concern of sentiment analysis (SA), or opinion mining. Efficiently capturing, gathering, and analyzing sentiments has been challenging for researchers. To deal with these challenges, in this research work, we propose a highly accurate approach for SA of fake news on COVID-19. The dataset contains fake news on COVID-19; we started with data preprocessing (replacing missing values, noise removal, tokenization, and stemming). We applied a semantic model with term frequency and inverse document frequency weighting for data representation. In the measuring and evaluation step, we applied eight machine-learning algorithms (Naive Bayesian, Adaboost, K-nearest neighbors, random forest, logistic regression, decision tree, neural networks, and support vector machine) and four deep learning models (CNN, LSTM, RNN, and GRU). Afterward, based on the results, we built a highly efficient prediction model with Python, and we trained and evaluated the classification model according to the performance measures (confusion matrix, classification rate, true positives rate...), then tested the model on a set of unclassified fake news on COVID-19 to predict the sentiment class of each item. The obtained results demonstrate a high accuracy compared to the other models. Finally, a set of recommendations is provided with future directions for this research to help researchers select an efficient sentiment analysis model for Twitter data.
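The term frequency and inverse document frequency weighting mentioned in this abstract can be illustrated with a few lines of code. This is a sketch of one common unsmoothed variant (relative term frequency times the natural log of the inverse document frequency); the function name `tf_idf` and the toy documents are assumptions for illustration, and the article does not state which variant or library its authors used.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document (hypothetical helper)."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            # Relative frequency in this document times log inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Toy, already-tokenized documents (made up for illustration).
docs = [
    ["covid", "vaccine", "hoax"],
    ["covid", "cases", "rise"],
    ["vaccine", "safe"],
]
weights = tf_idf(docs)
```

In the first toy document, "hoax" appears in only one of the three documents and so receives a higher weight than "covid", which appears in two. Library implementations often add smoothing and normalization, so their numbers will generally differ from this sketch.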
Affiliation(s)
| | - Rukhma Qasim
- Department of Computer Science, University of Gujrat, Pakistan
| | | | - Zeeshan Ahmad
- Department of Computer Science, University of Gujrat, Pakistan
| | - Hafsa Dar
- Department of Software Engineering, University of Gujrat, Pakistan
| | - Laiqa Rukhsar
- Department of Computer Science, University of Gujrat, Pakistan
| | - Zahra Aman
- Department of Computer Science, University of Gujrat, Pakistan
| | - Jamil Ahmad
- Professor Computer Science, Hazara University, Manshera, KPK, Pakistan
| |
|