1. Al-Shalif SA, Senan N, Saeed F, Ghaban W, Ibrahim N, Aamir M, Sharif W. A systematic literature review on meta-heuristic based feature selection techniques for text classification. PeerJ Comput Sci 2024; 10:e2084. [PMID: 38983195 PMCID: PMC11232610 DOI: 10.7717/peerj-cs.2084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Accepted: 05/03/2024] [Indexed: 07/11/2024]
Abstract
Feature selection (FS) is a critical step in many data science-based applications, especially in text classification, as it involves selecting relevant and important features from an original feature set. This process can improve learning accuracy, shorten learning time, and simplify outcomes. In text classification, there are often many redundant and irrelevant features that degrade the performance of the applied classifiers, and various techniques have been suggested to tackle this problem, categorized as traditional techniques and meta-heuristic (MH) techniques. To discover the optimal subset of features, FS processes require a search strategy, and MH techniques use various strategies to strike a balance between exploration and exploitation. The goal of this research article is to systematically analyze the MH techniques used for FS between 2015 and 2022, focusing on 108 primary studies from three databases (Scopus, Science Direct, and Google Scholar) to identify the techniques used, as well as their strengths and weaknesses. The findings indicate that MH techniques are efficient and outperform traditional techniques, with the potential for further exploration of MH techniques such as Ringed Seal Search (RSS) to improve FS in several applications.
Affiliation(s)
- Sarah Abdulkarem Al-Shalif
  - Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, Parit Raja, Johor, Malaysia
- Norhalina Senan
  - Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, Parit Raja, Johor, Malaysia
- Faisal Saeed
  - DAAI Research Group, Department of Computing and Data Science, School of Computing and Digital Technology, University of Birmingham, Birmingham, United Kingdom
- Wad Ghaban
  - Applied College, University of Tabuk, Tabuk, Saudi Arabia
- Noraini Ibrahim
  - Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, Parit Raja, Johor, Malaysia
- Muhammad Aamir
  - School of Electronics, Computing and Mathematics, University of Derby, Derby, United Kingdom
- Wareesa Sharif
  - Faculty of Computing, The Islamia University of Bahawalpur, Bahawalpur, Pakistan
2. Houssein EH, Hammad A, Emam MM, Ali AA. An enhanced Coati Optimization Algorithm for global optimization and feature selection in EEG emotion recognition. Comput Biol Med 2024; 173:108329. [PMID: 38513391 DOI: 10.1016/j.compbiomed.2024.108329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Revised: 03/07/2024] [Accepted: 03/17/2024] [Indexed: 03/23/2024]
Abstract
Emotion recognition based on Electroencephalography (EEG) signals has garnered significant attention across diverse domains including healthcare, education, information sharing, and gaming, among others. Despite its potential, the absence of a standardized feature set poses a challenge in efficiently classifying various emotions. Addressing the issue of high dimensionality, this paper introduces an advanced variant of the Coati Optimization Algorithm (COA), called eCOA, for global optimization and for selecting the best subset of EEG features for emotion recognition. Specifically, COA suffers from local optima and imbalanced exploitation abilities, like other metaheuristic methods. The proposed eCOA combines the COA and RUNge Kutta Optimizer (RUN) algorithms. The Scale Factor (SF) and Enhanced Solution Quality (ESQ) mechanisms from RUN are applied to resolve these shortcomings of COA. The proposed eCOA algorithm has been extensively evaluated using the CEC'22 test suite and two EEG emotion recognition datasets, DEAP and DREAMER. Furthermore, the eCOA is applied for binary and multi-class classification of emotions in the dimensions of valence, arousal, and dominance using a multi-layer perceptron neural network (MLPNN). The experimental results revealed that the eCOA algorithm has more powerful search capabilities than the original COA and seven well-known counterpart methods in terms of statistical, convergence, and diversity measures. Furthermore, eCOA can efficiently support feature selection to find the best EEG features to maximize performance on four quadratic emotion classification problems compared to its counterparts. The suggested method obtains a classification accuracy of 85.17% and 95.21% in the binary classification of low and high arousal emotions in two public datasets, DEAP and DREAMER, respectively, which are 5.58% and 8.98% superior to existing approaches working on the same datasets for different subjects.
Affiliation(s)
- Essam H Houssein
  - Faculty of Computers and Information, Minia University, Minia, Egypt.
- Asmaa Hammad
  - Faculty of Computers and Information, Minia University, Minia, Egypt.
- Marwa M Emam
  - Faculty of Computers and Information, Minia University, Minia, Egypt.
- Abdelmgeid A Ali
  - Faculty of Computers and Information, Minia University, Minia, Egypt.
3. Lu J, Fu H, Tang X, Liu Z, Huang J, Zou W, Chen H, Sun Y, Ning X, Li J. GOA-optimized deep learning for soybean yield estimation using multi-source remote sensing data. Sci Rep 2024; 14:7097. [PMID: 38528045 DOI: 10.1038/s41598-024-57278-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Accepted: 03/15/2024] [Indexed: 03/27/2024] Open
Abstract
Accurately estimating large-area crop yields, especially for soybeans, is essential for addressing global food security challenges. This study introduces a deep learning framework that focuses on precise county-level soybean yield estimation in the United States. It utilizes a wide range of multi-variable remote sensing data. The model used in this study is a state-of-the-art CNN-BiGRU model, which is enhanced by the GOA and a novel attention mechanism (GCBA). This model excels in handling intricate time series and diverse remote sensing datasets. Compared to five leading machine learning and deep learning models, our GCBA model demonstrates superior performance, particularly in the 2019 and 2020 evaluations, achieving remarkable R2, RMSE, MAE and MAPE values. This sets a new benchmark in yield estimation accuracy. Importantly, the study highlights the significance of integrating multi-source remote sensing data. It reveals that synthesizing information from various sensors and incorporating photosynthesis-related parameters significantly enhances yield estimation precision. These advancements not only provide transformative insights for precision agricultural management but also establish a solid scientific foundation for informed decision-making in global agricultural production and food security.
Affiliation(s)
- Jian Lu
  - Institute of Smart Agriculture, Jilin Agricultural University, Changchun, 130118, People's Republic of China
- Hongkun Fu
  - College of Agriculture, Jilin Agricultural University, Changchun, 130118, People's Republic of China
- Xuhui Tang
  - College of Information Technology, Jilin Agricultural University, Changchun, 130118, People's Republic of China
- Zhao Liu
  - Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, 130102, People's Republic of China
- Jujian Huang
  - College of Surveying and Exploration, Jilin Jianzhu University, Changchun, 130119, People's Republic of China
- Wenlong Zou
  - Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, 130102, People's Republic of China
- Hui Chen
  - Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, 130102, People's Republic of China
- Yue Sun
  - Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, 130102, People's Republic of China
- Xiangyu Ning
  - Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, 130102, People's Republic of China
- Jian Li
  - Institute of Smart Agriculture, Jilin Agricultural University, Changchun, 130118, People's Republic of China.
4. Liu G, Guo Z, Liu W, Jiang F, Fu E. A feature selection method based on the Golden Jackal-Grey Wolf Hybrid Optimization Algorithm. PLoS One 2024; 19:e0295579. [PMID: 38165924 PMCID: PMC10760777 DOI: 10.1371/journal.pone.0295579] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2023] [Accepted: 11/20/2023] [Indexed: 01/04/2024] Open
Abstract
This paper proposes a feature selection method based on a hybrid optimization algorithm that combines the Golden Jackal Optimization (GJO) and Grey Wolf Optimizer (GWO). The primary objective of this method is to create an effective data dimensionality reduction technique for eliminating redundant, irrelevant, and noisy features within high-dimensional datasets. Drawing inspiration from the Chinese idiom "Chai Lang Hu Bao," hybrid algorithm mechanisms, and cooperative behaviors observed in natural animal populations, we amalgamate the GWO algorithm, the Lagrange interpolation method, and the GJO algorithm to propose the multi-strategy fusion GJO-GWO algorithm. In Case 1, the GJO-GWO algorithm addressed eight complex benchmark functions. In Case 2, GJO-GWO was utilized to tackle ten feature selection problems. Experimental results consistently demonstrate that under identical experimental conditions, whether solving complex benchmark functions or addressing feature selection problems, GJO-GWO exhibits smaller means, lower standard deviations, higher classification accuracy, and reduced execution times. These findings affirm the superior optimization performance, classification accuracy, and stability of the GJO-GWO algorithm.
Affiliation(s)
- Guangwei Liu
  - College of Mining, Liaoning Technical University, Fuxin, Liaoning, China
- Zhiqing Guo
  - College of Mining, Liaoning Technical University, Fuxin, Liaoning, China
- Wei Liu
  - College of Science, Liaoning Technical University, Fuxin, Liaoning, China
- Feng Jiang
  - College of Science, Liaoning Technical University, Fuxin, Liaoning, China
- Ensan Fu
  - College of Mining, Liaoning Technical University, Fuxin, Liaoning, China
5. Caselli N, Soto R, Crawford B, Valdivia S, Chicata E, Olivares R. Dynamic Population on Bio-Inspired Algorithms Using Machine Learning for Global Optimization. Biomimetics (Basel) 2023; 9:7. [PMID: 38248581 PMCID: PMC11154490 DOI: 10.3390/biomimetics9010007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Revised: 12/20/2023] [Accepted: 12/21/2023] [Indexed: 01/23/2024] Open
Abstract
In the optimization field, the ability to efficiently tackle complex and high-dimensional problems remains a persistent challenge. Metaheuristic algorithms, with a particular emphasis on their autonomous variants, are emerging as promising tools to overcome this challenge. The term "autonomous" refers to these variants' ability to dynamically adjust certain parameters based on their own outcomes, without external intervention. The objective is to leverage the advantages and characteristics of an unsupervised machine learning clustering technique to configure the population parameter with autonomous behavior, and emphasize how we incorporate the characteristics of search space clustering to enhance the intensification and diversification of the metaheuristic. This allows dynamic adjustments based on its own outcomes, whether by increasing or decreasing the population in response to the need for diversification or intensification of solutions. In this manner, it aims to imbue the metaheuristic with features for a broader search of solutions that can yield superior results. This study provides an in-depth examination of autonomous metaheuristic algorithms, including Autonomous Particle Swarm Optimization, Autonomous Cuckoo Search Algorithm, and Autonomous Bat Algorithm. We submit these algorithms to a thorough evaluation against their original counterparts using high-density functions from the well-known CEC LSGO benchmark suite. Quantitative results revealed performance enhancements in the autonomous versions, with Autonomous Particle Swarm Optimization consistently outperforming its peers in achieving optimal minimum values. Autonomous Cuckoo Search Algorithm and Autonomous Bat Algorithm also demonstrated noteworthy advancements over their traditional counterparts. A salient feature of these algorithms is the continuous nature of their population, which significantly bolsters their capability to navigate complex and high-dimensional search spaces. 
However, like all methodologies, there were challenges in ensuring consistent performance across all test scenarios. The intrinsic adaptability and autonomous decision making embedded within these algorithms herald a new era of optimization tools suited for complex real-world challenges. In sum, this research accentuates the potential of autonomous metaheuristics in the optimization arena, laying the groundwork for their expanded application across diverse challenges and domains. We recommend further explorations and adaptations of these autonomous algorithms to fully harness their potential.
Affiliation(s)
- Nicolás Caselli
  - Escuela de Ingeniería Informática, Pontificia Universidad Católica de Valparaíso, Valparaíso 2362807, Chile
- Ricardo Soto
  - Escuela de Ingeniería Informática, Pontificia Universidad Católica de Valparaíso, Valparaíso 2362807, Chile
- Broderick Crawford
  - Escuela de Ingeniería Informática, Pontificia Universidad Católica de Valparaíso, Valparaíso 2362807, Chile
- Sergio Valdivia
  - Departamento de Tecnologías de Información y Comunicación, Universidad de Valparaíso, Valparaíso 2361864, Chile
- Elizabeth Chicata
  - Escuela de Ingeniería Informática, Pontificia Universidad Católica de Valparaíso, Valparaíso 2362807, Chile
- Rodrigo Olivares
  - Escuela de Ingeniería Informática, Universidad de Valparaíso, Valparaíso 2362905, Chile
6. Jia L, Wang T, Gad AG, Salem A. A weighted-sum chaotic sparrow search algorithm for interdisciplinary feature selection and data classification. Sci Rep 2023; 13:14061. [PMID: 37640716 PMCID: PMC10462760 DOI: 10.1038/s41598-023-38252-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/25/2023] [Accepted: 07/05/2023] [Indexed: 08/31/2023] Open
Abstract
In today's data-driven digital culture, there is a critical demand for optimized solutions that essentially reduce operating expenses while attempting to increase productivity. The amount of memory and processing time that can be used to process enormous volumes of data are subject to a number of limitations. This would undoubtedly be more of a problem if a dataset contained redundant and uninteresting information. For instance, many datasets contain a number of non-informative features that primarily deceive a given classification algorithm. In order to tackle this, researchers have been developing a variety of feature selection (FS) techniques that aim to eliminate unnecessary information from the raw datasets before putting them in front of a machine learning (ML) algorithm. Meta-heuristic optimization algorithms are often a solid choice to solve NP-hard problems like FS. In this study, we present a wrapper FS technique based on the sparrow search algorithm (SSA), a type of meta-heuristic. SSA is a swarm intelligence (SI) method that stands out because of its quick convergence and improved stability. SSA does have some drawbacks, like lower swarm diversity and weak exploration ability in late iterations, like the majority of SI algorithms. So, using ten chaotic maps, we try to ameliorate SSA in three ways: (i) the initial swarm generation; (ii) the substitution of two random variables in SSA; and (iii) clamping the sparrows crossing the search range. As a result, we get CSSA, a chaotic form of SSA. Extensive comparisons show CSSA to be superior in terms of swarm diversity and convergence speed in solving various representative functions from the Institute of Electrical and Electronics Engineers (IEEE) Congress on Evolutionary Computation (CEC) benchmark set. 
Furthermore, experimental analysis of CSSA on eighteen interdisciplinary, multi-scale ML datasets from the University of California Irvine (UCI) data repository, as well as three high-dimensional microarray datasets, demonstrates that CSSA outperforms twelve state-of-the-art algorithms in a classification task based on FS discipline. Finally, a 5%-significance-level statistical post-hoc analysis based on Wilcoxon's signed-rank test, Friedman's rank test, and Nemenyi's test confirms CSSA's significance in terms of overall fitness, classification accuracy, selected feature size, computational time, convergence trace, and stability.
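As a rough illustration of two of the modifications listed in the abstract above (chaotic generation of the initial swarm and keeping sparrows inside the search range), here is a minimal Python sketch. It assumes the logistic map as the chaotic map (the abstract mentions ten maps without naming them) and uses plain clipping as a stand-in for the paper's handling of out-of-range positions. All function names and parameter values are hypothetical and are not taken from the paper.

```python
def logistic_map_sequence(x0, n, r=4.0):
    """Return n successive values of the logistic map x <- r * x * (1 - x)."""
    values = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def chaotic_initial_swarm(n_agents, n_dims, lower, upper, x0=0.7):
    """Build an initial swarm by scaling chaotic values from [0, 1] into [lower, upper].

    Assumption: one logistic-map stream is shared by all agents and dimensions.
    """
    seq = logistic_map_sequence(x0, n_agents * n_dims)
    span = upper - lower
    return [
        [lower + seq[i * n_dims + j] * span for j in range(n_dims)]
        for i in range(n_agents)
    ]


def clamp(position, lower, upper):
    """Clip every coordinate of a position back into the search range."""
    return [min(max(v, lower), upper) for v in position]


swarm = chaotic_initial_swarm(n_agents=5, n_dims=3, lower=-10.0, upper=10.0)
```

A chaotic sequence is deterministic for a given seed `x0`, so the same swarm is reproduced on every run; whether this matches the paper's seeding scheme is not stated in the abstract.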
Affiliation(s)
- LiYun Jia
  - Department of Mathematics and Physics, Hebei University of Architecture, Zhangjiakou, 075000, China
- Tao Wang
  - Department of Mathematics and Physics, Hebei University of Architecture, Zhangjiakou, 075000, China
- Ahmed G Gad
  - Faculty of Computers and Information, Kafrelsheikh University, Kafrelsheikh, 33516, Egypt.
- Ahmed Salem
  - College of Computing and Information Technology, Arab Academy for Science, Technology and Maritime Transport (AASTMT), Cairo, Egypt
7. Fan Y, Yang H, Wang Y, Xu Z, Lu D. A Variable Step Crow Search Algorithm and Its Application in Function Problems. Biomimetics (Basel) 2023; 8:395. [PMID: 37754146 PMCID: PMC10526407 DOI: 10.3390/biomimetics8050395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Revised: 08/16/2023] [Accepted: 08/24/2023] [Indexed: 09/28/2023] Open
Abstract
Optimization algorithms are widely used to solve problems in many fields and are inspired by natural principles, animal living habits, plant pollination, chemical principles, and physical principles. The performance of an optimization algorithm directly affects solution accuracy. The Crow Search Algorithm (CSA) is a simple and efficient algorithm inspired by the natural behaviors of crows. However, the flight length of CSA is a fixed value, which makes the algorithm fall into local optima and severely limits its solving ability. To solve this problem, this paper proposes a Variable Step Crow Search Algorithm (VSCSA). The proposed algorithm uses the cosine function to enhance the search ability of CSA, which greatly improves both the solution quality of the population and the convergence speed. In the update phase, the VSCSA increases population diversity and enhances the global search ability of the basic CSA. The experiments used 14 test functions, 2017 CEC functions, and engineering application problems to compare VSCSA with different algorithms. The experimental results showed that VSCSA performs better in fitness values, iteration curves, box plots, search paths, and Wilcoxon test results, which indicates that VSCSA is strongly competitive. The VSCSA performs well on various test functions, and its search accuracy is greatly improved.
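To make the idea of a variable flight length concrete, the following Python sketch replaces the fixed flight length of a basic crow search step with a cosine-shaped schedule. The abstract only says that a cosine function is used, so the specific schedule here (a quarter cosine wave decaying from `fl_max` to `fl_min`) is an assumption, as are all names and default values. The position update follows the commonly described CSA step toward another crow's memorized position.

```python
import math
import random


def cosine_flight_length(iteration, max_iterations, fl_max=2.0, fl_min=0.0):
    """Hypothetical schedule: flight length decays from fl_max to fl_min along a quarter cosine."""
    progress = iteration / max_iterations
    return fl_min + (fl_max - fl_min) * math.cos(0.5 * math.pi * progress)


def crow_step(position, target_memory, flight_length, rng):
    """Move a crow toward the memorized position of the crow it follows."""
    r = rng.random()  # random factor in [0, 1)
    return [x + r * flight_length * (m - x) for x, m in zip(position, target_memory)]


rng = random.Random(0)
position = [1.0, -2.0]
memory = [0.0, 0.0]
for t in range(10):
    fl = cosine_flight_length(t, 10)
    position = crow_step(position, memory, fl, rng)
```

Large early steps favor exploration and small late steps favor exploitation, which is one plausible reading of how a variable step can help the search avoid stalling in a local optimum.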
Affiliation(s)
- Yaping Wang
  - Key Laboratory of Advanced Manufacturing and Intelligent Technology, Ministry of Education, School of Mechanical and Power Engineering, Harbin University of Science and Technology, Harbin 150080, China
8. Ferahtia S, Houari A, Rezk H, Djerioui A, Machmoum M, Motahhir S, Ait-Ahmed M. Red-tailed hawk algorithm for numerical optimization and real-world problems. Sci Rep 2023; 13:12950. [PMID: 37558724 PMCID: PMC10412609 DOI: 10.1038/s41598-023-38778-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/13/2023] [Accepted: 07/14/2023] [Indexed: 08/11/2023] Open
Abstract
This study proposes a new nature-inspired metaheuristic optimization algorithm called the red-tailed hawk algorithm (RTH). As a predator, the red-tailed hawk follows a hunting strategy that runs from detecting the prey to the swoop stage. There are three stages during the hunting process. In the high soaring stage, the red-tailed hawk explores the search space and determines the area with the prey location. In the low soaring stage, the red-tailed hawk moves inside the selected area around the prey to choose the best position for the hunt. Then, the red-tailed hawk swings and hits its target in the stooping and swooping stages. The proposed algorithm mimics the prey-hunting method of the red-tailed hawk for solving real-world optimization problems. The performance of the proposed RTH algorithm has been evaluated on three classes of problems. The first class includes three specific kinds of optimization problems: 22 standard benchmark functions, including unimodal, multimodal, and fixed-dimensional multimodal functions, IEEE Congress on Evolutionary Computation 2020 (CEC2020), and IEEE CEC2022. The proposed algorithm is compared with eight recent algorithms to confirm its contribution to solving these problems. The considered algorithms are Farmland Fertility Optimizer (FO), African Vultures Optimization Algorithm (AVOA), Mountain Gazelle Optimizer (MGO), Gorilla Troops Optimizer (GTO), COOT algorithm, Hunger Games Search (HGS), Aquila Optimizer (AO), and Harris Hawks optimization (HHO). The results are compared in terms of accuracy, robustness, and convergence speed. The second class includes seven real-world engineering problems, used to investigate the RTH performance in depth against other published results. Finally, proton exchange membrane fuel cell (PEMFC) parameter extraction is performed to evaluate the algorithm on a complex problem. The proposed algorithm is compared with several published papers to confirm its performance. 
The final results for each class confirm the ability of the proposed RTH algorithm to provide higher performance in most cases. For the first class, the RTH obtained the optimal solutions for most functions with faster convergence speed. The RTH provided better performance for the second and third classes when solving the real-world engineering problems or extracting the PEMFC parameters.
Affiliation(s)
- Seydali Ferahtia
  - Institut de Recherche en Énergie Électrique de Nantes Atlantique, IREENA, Nantes University, Saint-Nazaire, France
  - Laboratoire de Génie Electrique, Dept. of Electrical Engineering, University of M'sila, M'sila, Algeria
- Azeddine Houari
  - Institut de Recherche en Énergie Électrique de Nantes Atlantique, IREENA, Nantes University, Saint-Nazaire, France
- Hegazy Rezk
  - College of Engineering at Wadi Addawaser, Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia
- Ali Djerioui
  - Laboratoire de Génie Electrique, Dept. of Electrical Engineering, University of M'sila, M'sila, Algeria
- Mohamed Machmoum
  - Institut de Recherche en Énergie Électrique de Nantes Atlantique, IREENA, Nantes University, Saint-Nazaire, France
- Saad Motahhir
  - ENSA, University of Sidi Mohamed Ben Abdellah, Fez, Morocco.
- Mourad Ait-Ahmed
  - Institut de Recherche en Énergie Électrique de Nantes Atlantique, IREENA, Nantes University, Saint-Nazaire, France
9. Abd El-Mageed AA, Abohany AA, Elashry A. Effective Feature Selection Strategy for Supervised Classification based on an Improved Binary Aquila Optimization Algorithm. Computers & Industrial Engineering 2023; 181:109300. [DOI: 10.1016/j.cie.2023.109300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/02/2023]
10. AbdelAty AM, Yousri D, Chelloug S, Alduailij M, Abd Elaziz M. Fractional order adaptive hunter-prey optimizer for feature selection. Alexandria Engineering Journal 2023; 75:531-547. [DOI: 10.1016/j.aej.2023.05.092] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 09/02/2023]
11. Cai C, Gou B, Khishe M, Mohammadi M, Rashidi S, Moradpour R, Mirjalili S. Improved deep convolutional neural networks using chimp optimization algorithm for Covid19 diagnosis from the X-ray images. Expert Systems with Applications 2023; 213:119206. [PMID: 36348736 DOI: 10.1016/j.eswa.2020.113338] [Citation(s) in RCA: 181] [Impact Index Per Article: 181.0] [Received: 08/12/2021] [Revised: 09/17/2022] [Accepted: 10/31/2022] [Indexed: 05/25/2023]
Abstract
Applying Deep Learning (DL) to radiological images (i.e., chest X-rays) is emerging because of the need for accurate and fast COVID-19 detectors. Deep Convolutional Neural Networks (DCNN) have typically been used as robust COVID-19 positive case detectors in these approaches. Such DCNNs tend to utilize Gradient Descent-Based (GDB) algorithms as the trainers of the last fully-connected layers. Although GDB training algorithms have simple structures and fast convergence rates for cases with large training samples, they suffer from the manual tuning of numerous parameters, getting stuck in local minima, the requirement for large training sample sets, and inherently sequential procedures. It is exceedingly challenging to parallelize them with Graphics Processing Units (GPU). Consequently, the Chimp Optimization Algorithm (ChOA) is presented for training the DCNN's fully connected layers in light of the scarcity of a big COVID-19 training dataset and for the purpose of developing a fast COVID-19 detector with the capability of parallel implementation. In addition, two publicly accessible datasets termed COVID-Xray-5k and COVIDetectioNet are used to benchmark the proposed detector known as DCCN-Chimp. In order to make a fair comparison, two structures are proposed: i-6c-2s-12c-2s and i-8c-2s-16c-2s, both of which have had their hyperparameters fine-tuned. The outcomes are evaluated in comparison to standard DCNN, Hybrid DCNN plus Genetic Algorithm (DCNN-GA), and Matched Subspace classifier with Adaptive Dictionaries (MSAD). Due to the large variation in results, we employ a weighted average of the ensemble of ten trained DCNN-ChOA, with validation accuracy being used to determine the final weights. The validation accuracy for the mixed ensemble DCNN-ChOA is 99.11%. LeNet-5 DCNN's ensemble detection accuracy on COVID-19 is 84.58%. Comparatively, the suggested DCNN-ChOA yields over 99.11% accurate detection with a false alarm rate of less than 0.89%. 
The outcomes show that the DCCN-Chimp can deliver noticeably superior results than the comparable detectors. The Class Activation Map (CAM) is another tool used in this study to identify probable COVID-19-infected areas. Results show that highlighted regions are completely connected with clinical outcomes, which has been verified by experts.
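The accuracy-weighted ensemble mentioned in the abstract above can be sketched as follows. This is a minimal Python illustration that assumes each trained network outputs a vector of class probabilities and that weights are simply the validation accuracies normalized to sum to one; the abstract does not give the exact weighting formula, and the function name and example numbers are hypothetical.

```python
def weighted_ensemble(probabilities, val_accuracies):
    """Combine per-model class probabilities using normalized validation accuracies as weights.

    probabilities: one list of class probabilities per model.
    val_accuracies: one validation accuracy per model (assumed weighting rule).
    """
    total = sum(val_accuracies)
    weights = [acc / total for acc in val_accuracies]
    n_classes = len(probabilities[0])
    return [
        sum(w * p[c] for w, p in zip(weights, probabilities))
        for c in range(n_classes)
    ]


# Two illustrative models voting on a two-class problem (made-up numbers).
combined = weighted_ensemble([[1.0, 0.0], [0.0, 1.0]], [0.75, 0.25])
predicted_class = max(range(len(combined)), key=lambda c: combined[c])
```

Models with higher validation accuracy pull the combined prediction toward their own output, which is one way to damp the run-to-run variation the abstract describes.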
Affiliation(s)
- Chengfeng Cai
  - School of Mechanical Engineering, Northwestern Polytechnical University, Xi'an 710072, China
- Bingchen Gou
  - School of Mechanical Engineering, Northwestern Polytechnical University, Xi'an 710072, China
- Mohammad Khishe
  - Department of Electrical Engineering, Imam Khomeini Marine Science University, Nowshahr, Iran
- Mokhtar Mohammadi
  - Department of Information Technology, College of Engineering and Computer Science, Lebanese French University, Kurdistan Region, Iraq
- Shima Rashidi
  - Department of Computer Science, College of Science and Technology, University of Human Development, Sulaymaniyah, Kurdistan Region, Iraq
- Reza Moradpour
  - Department of Electrical Engineering, Imam Khomeini Marine Science University, Nowshahr, Iran
- Seyedali Mirjalili
  - Centre for Artificial Intelligence Research and Optimization, Torrens University, Australia
  - University Research and Innovation Center, Obuda University, 1034 Budapest, Hungary
12. Alzaqebah A, Al-Kadi O, Aljarah I. An enhanced Harris hawk optimizer based on extreme learning machine for feature selection. Progress in Artificial Intelligence 2023. [DOI: 10.1007/s13748-023-00298-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/01/2023]
13. Sadeghian Z, Akbari E, Nematzadeh H, Motameni H. A review of feature selection methods based on meta-heuristic algorithms. J Exp Theor Artif In 2023. [DOI: 10.1080/0952813x.2023.2183267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/07/2023]
Affiliation(s)
- Zohre Sadeghian
  - Department of Computer Engineering, Sari Branch, Islamic Azad University, Sari, Iran
- Ebrahim Akbari
  - Department of Computer Engineering, Sari Branch, Islamic Azad University, Sari, Iran
- Hossein Nematzadeh
  - Department of Computer Engineering, Sari Branch, Islamic Azad University, Sari, Iran
- Homayun Motameni
  - Department of Computer Engineering, Sari Branch, Islamic Azad University, Sari, Iran
14. Devi RM, Premkumar M, Kiruthiga G, Sowmya R. IGJO: An Improved Golden Jackel Optimization Algorithm Using Local Escaping Operator for Feature Selection Problems. Neural Process Lett 2023. [DOI: 10.1007/s11063-023-11146-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2023]
15. Mafarja M, Thaher T, Al-Betar MA, Too J, Awadallah MA, Abu Doush I, Turabieh H. Classification framework for faulty-software using enhanced exploratory whale optimizer-based feature selection scheme and random forest ensemble learning. Appl Intell 2023; 53:1-43. [PMID: 36785593 PMCID: PMC9909674 DOI: 10.1007/s10489-022-04427-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Accepted: 12/23/2022] [Indexed: 02/11/2023]
Abstract
Software Fault Prediction (SFP) is an important process to detect the faulty components of the software to detect faulty classes or faulty modules early in the software development life cycle. In this paper, a machine learning framework is proposed for SFP. Initially, pre-processing and re-sampling techniques are applied to make the SFP datasets ready to be used by ML techniques. Thereafter seven classifiers are compared, namely K-Nearest Neighbors (KNN), Naive Bayes (NB), Linear Discriminant Analysis (LDA), Linear Regression (LR), Decision Tree (DT), Support Vector Machine (SVM), and Random Forest (RF). The RF classifier outperforms all other classifiers in terms of eliminating irrelevant/redundant features. The performance of RF is improved further using a dimensionality reduction method called binary whale optimization algorithm (BWOA) to eliminate the irrelevant/redundant features. Finally, the performance of BWOA is enhanced by hybridizing the exploration strategies of the grey wolf optimizer (GWO) and harris hawks optimization (HHO) algorithms. The proposed method is called SBEWOA. The SFP datasets utilized are selected from the PROMISE repository using sixteen datasets for software projects with different sizes and complexity. The comparative evaluation against nine well-established feature selection methods proves that the proposed SBEWOA is able to significantly produce competitively superior results for several instances of the evaluated dataset. The algorithms' performance is compared in terms of accuracy, the number of features, and fitness function. This is also proved by the 2-tailed P-values of the Wilcoxon signed ranks statistical test used. In conclusion, the proposed method is an efficient alternative ML method for SFP that can be used for similar problems in the software engineering domain.
Affiliation(s)
- Majdi Mafarja
- Department of Computer Science, Birzeit University, Birzeit, Palestine
- Thaer Thaher
- Department of Computer Systems Engineering, Arab American University, Jenin, Palestine
- Information Technology Engineering, Al-Quds University, Abu Dies, Jerusalem, Palestine
- Mohammed Azmi Al-Betar
- Artificial Intelligence Research Center (AIRC), College of Engineering and Information Technology, Ajman University, Ajman, United Arab Emirates
- Irbid, Jordan
- Jingwei Too
- Faculty of Electrical Engineering, Universiti Teknikal Malaysia Melaka, Hang Tuah Jaya, 76100 Durian Tunggal Melaka, Malaysia
- Mohammed A. Awadallah
- Department of Computer Science, Al-Aqsa University, P.O. Box 4051, Gaza, Palestine
- Artificial Intelligence Research Center (AIRC), Ajman University, Ajman, United Arab Emirates
- Iyad Abu Doush
- Department of Computing, College of Engineering and Applied Sciences, American University of Kuwait, Salmiya, Kuwait
- Computer Science Department, Yarmouk University, Irbid, Jordan
- Hamza Turabieh
- Department of Health Management and Informatics, University of Missouri, Columbia, 5 Hospital Drive, Columbia, MO 65212 USA
|
16
|
Wani JA, Ganaie SA. The scientific outcome in the domain of grey literature: bibliometric mapping and visualisation using the R-bibliometrix package and the VOSviewer. LIBRARY HI TECH 2022. [DOI: 10.1108/lht-01-2022-0012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
Abstract
Purpose: The current study aims to map the scientific output of grey literature (GL) through bibliometric approaches. Design/methodology/approach: The source for data extraction is a comprehensive “indexing and abstracting” database, “Web of Science” (WOS). A lexical title search was applied to get the corpus of the study – a total of 4,599 articles were extracted for data analysis and visualisation. Further, the data were analysed using the data analytical tools R-studio and VOSViewer. Findings: The findings showed that the “publications” have grown substantially over the timeline. The most productive phase (2018–2021) resulted in 47% of articles. The prominent sources were PLOS One and NeuroImage. The highest number of papers were contributed by Haddaway and Kumar. The most relevant countries were the USA and UK. Practical implications: The study is useful for researchers interested in the GL research domain. The study helps to understand the evolution of the GL to provide research support further in this area. Originality/value: The present study provides a new orientation to the scholarly output of the GL. The study is rigorous and all-inclusive based on analytical operations like the research networks, collaboration and visualisation. To the best of the authors' knowledge, this manuscript is original, and no similar works have been found with the research objectives included here.
|
17
|
Wang Z, Gao S, Zhang Y, Guo L. Symmetric uncertainty-incorporated probabilistic sequence-based ant colony optimization for feature selection in classification. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109874] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
|
18
|
Abed-alguni BH, Alawad NA, Al-Betar MA, Paul D. Opposition-based sine cosine optimizer utilizing refraction learning and variable neighborhood search for feature selection. APPL INTELL 2022; 53:13224-13260. [PMID: 36247211 PMCID: PMC9547101 DOI: 10.1007/s10489-022-04201-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/21/2022] [Indexed: 12/03/2022]
Abstract
This paper proposes new improved binary versions of the Sine Cosine Algorithm (SCA) for the Feature Selection (FS) problem. FS is an essential machine learning and data mining task of choosing a subset of highly discriminating features from noisy, irrelevant, high-dimensional, and redundant features to best represent a dataset. SCA is a recent metaheuristic algorithm established to emulate a model based on sine and cosine trigonometric functions. It was initially proposed to tackle problems in the continuous domain. The SCA has been modified to Binary SCA (BSCA) to deal with the binary domain of the FS problem. To improve the performance of BSCA, three accumulative improved variations are proposed (i.e., IBSCA1, IBSCA2, and IBSCA3) where the last version has the best performance. IBSCA1 employs Opposition Based Learning (OBL) to help ensure a diverse population of candidate solutions. IBSCA2 improves IBSCA1 by adding Variable Neighborhood Search (VNS) and Laplace distribution to support several mutation methods. IBSCA3 improves IBSCA2 by optimizing the best candidate solution using Refraction Learning (RL), a novel OBL approach based on light refraction. For performance evaluation, 19 real-world datasets, including a COVID-19 dataset, were selected with different numbers of features, classes, and instances. Three performance measurements have been used to test the IBSCA versions: classification accuracy, number of features, and fitness values. Furthermore, the performance of the last variation, IBSCA3, is compared against 28 existing popular algorithms. Interestingly, IBSCA3 outperformed almost all comparative methods in terms of classification accuracy and fitness values. At the same time, it was ranked 15 out of 19 in terms of number of features. The overall simulation and statistical results indicate that IBSCA3 performs better than the other algorithms.
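The opposition-based learning step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common textbook definition of an opposite point (lower + upper - x per dimension) and a minimised fitness; the function names and the keep-the-best selection rule are illustrative assumptions, not taken from the IBSCA paper.

```python
import random

def opposite(x, lower, upper):
    """Return the opposite of point x inside the box [lower, upper] (assumed OBL definition)."""
    return [lo + hi - xi for xi, lo, hi in zip(x, lower, upper)]

def obl_initialize(pop_size, lower, upper, fitness, rng):
    """Draw a random population, add each member's opposite, keep the pop_size fittest.

    Hypothetical helper: fitness is minimised and selection is a plain sort.
    """
    population = [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
                  for _ in range(pop_size)]
    candidates = population + [opposite(x, lower, upper) for x in population]
    return sorted(candidates, key=fitness)[:pop_size]

# Usage: a sphere function stands in for a real feature-selection fitness.
rng = random.Random(0)
best = obl_initialize(5, [-1.0, -1.0], [1.0, 1.0], lambda x: sum(v * v for v in x), rng)
```

Evaluating both a point and its opposite doubles the number of fitness calls at initialisation, which is the usual price paid for the extra diversity.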
Affiliation(s)
- Mohammed Azmi Al-Betar
- Artificial Intelligence Research Center (AIRC), College of Engineering and Information Technology, Ajman University, Ajman, United Arab Emirates
- David Paul
- School of Science and Technology, University of New England, Armidale, Australia
|
19
|
An Efficient High-dimensional Feature Selection Approach Driven By Enhanced Multi-strategy Grey Wolf Optimizer for Biological Data Classification. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07836-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/10/2022]
|
20
|
Elaziz MA, Ahmadein M, Ataya S, Alsaleh N, Forestiero A, Elsheikh AH. A Quantum-Based Chameleon Swarm for Feature Selection. MATHEMATICS 2022; 10:3606. [DOI: 10.3390/math10193606] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/02/2023]
Abstract
The Internet of Things is widely used, which results in the collection of enormous amounts of data with numerous redundant, irrelevant, and noisy features. In addition, many of these features need to be managed. Consequently, developing an effective feature selection (FS) strategy becomes a difficult goal. Many FS techniques, based on bioinspired metaheuristic methods, have been developed to tackle this problem. However, these methods still suffer from limitations; so, in this paper, we developed an alternative FS technique, based on integrating operators of the chameleon swarm algorithm (Cham) with the quantum-based optimization (QBO) technique. Using eighteen datasets from various real-world applications, the proposed QCham is investigated and compared to well-known FS methods. The comparisons demonstrate the benefits of including a QBO operator in the Cham because the proposed QCham can efficiently and accurately detect the most crucial features. QCham achieves nearly 92.6%, with a CPU time of nearly 1.7 s, over all the tested datasets. This indicates the advantages of QCham over the comparative algorithms and the high efficiency of integrating the QBO with the operators of the Cham algorithm, which are used to enhance the balance between exploration and exploitation.
|
21
|
Improved firefly algorithm for feature selection with the ReliefF-based initialization and the weighted voting mechanism. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07755-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
|
22
|
A Novel Hybrid GOA-XGB Model for Estimating Wheat Aboveground Biomass Using UAV-Based Multispectral Vegetation Indices. REMOTE SENSING 2022. [DOI: 10.3390/rs14143506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/25/2023]
Abstract
The rapid and nondestructive determination of wheat aboveground biomass (AGB) is important for accurate and efficient agricultural management. In this study, we established a novel hybrid model, known as extreme gradient boosting (XGBoost) optimization using the grasshopper optimization algorithm (GOA-XGB), which could accurately determine an ideal combination of vegetation indices (VIs) for simulating wheat AGB. Five multispectral bands of the unmanned aerial vehicle platform and 56 types of VIs obtained based on the five bands were used to drive the new model. The GOA-XGB model was compared with many state-of-the-art models, for example, multiple linear regression (MLR), multilayer perceptron (MLP), gradient boosting decision tree (GBDT), Gaussian process regression (GPR), random forest (RF), support vector machine (SVM), XGBoost, SVM optimization by particle swarm optimization (PSO), SVM optimization by the whale optimization algorithm (WOA), SVM optimization by the GOA (GOA-SVM), XGBoost optimization by PSO, XGBoost optimization by the WOA. The results demonstrated that MLR and GOA-MLR models had poor prediction accuracy for AGB, and the accuracy did not significantly improve when input factors were more than three. Among single-factor-driven machine learning (ML) models, the GPR model had the highest accuracy, followed by the XGBoost model. When the input combinations of multispectral bands and VIs were used, the GOA-XGB model (having 37 input factors) had the highest accuracy, with RMSE = 0.232 kg m⁻², R² = 0.847, MAE = 0.178 kg m⁻², and NRMSE = 0.127. When the XGBoost feature selection was used to reduce the input factors to 16, the model accuracy improved further to RMSE = 0.226 kg m⁻², R² = 0.855, MAE = 0.172 kg m⁻², and NRMSE = 0.123. Based on the developed model, the average AGB of the plot was 1.49 ± 0.34 kg.
|
23
|
Self-adaptive salp swarm algorithm for optimization problems. Soft comput 2022. [DOI: 10.1007/s00500-022-07280-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
|
24
|
Dokeroglu T, Deniz A, Kiziloz HE. A comprehensive survey on recent metaheuristics for feature selection. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.04.083] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 01/05/2023]
|
25
|
|
26
|
Abd Elaziz M, Ouadfel S, Abd El-Latif AA, Ali Ibrahim R. Feature Selection Based on Modified Bio-inspired Atomic Orbital Search Using Arithmetic Optimization and Opposite-Based Learning. Cognit Comput 2022. [DOI: 10.1007/s12559-022-10022-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|
27
|
EBBA: An Enhanced Binary Bat Algorithm Integrated with Chaos Theory and Lévy Flight for Feature Selection. FUTURE INTERNET 2022. [DOI: 10.3390/fi14060178] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 01/27/2023] Open
Abstract
Feature selection can efficiently improve classification accuracy and reduce the dimension of datasets. However, feature selection is a challenging and complex task that requires a high-performance optimization algorithm. In this paper, we propose an enhanced binary bat algorithm (EBBA) which is originated from the conventional binary bat algorithm (BBA) as the learning algorithm in a wrapper-based feature selection model. First, we model the feature selection problem and then transfer it as a fitness function. Then, we propose an EBBA for solving the feature selection problem. In EBBA, we introduce the Lévy flight-based global search method, population diversity boosting method and chaos-based loudness method to improve the BA and make it more applicable to feature selection problems. Finally, the simulations are conducted to evaluate the proposed EBBA and the simulation results demonstrate that the proposed EBBA outmatches other comparison benchmarks. Moreover, we also illustrate the effectiveness of the proposed improved factors by tests.
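The abstract above speaks of turning the feature selection problem into a fitness function without giving its form. A minimal sketch of one widely used wrapper-style fitness follows, assuming a weighted sum of classification error and the fraction of selected features; the weight `alpha`, the function name, and the penalty for an empty subset are assumptions for illustration and are not taken from the EBBA paper.

```python
def wrapper_fitness(mask, error_rate, alpha=0.99):
    """Score a binary feature mask; lower is better.

    mask        list of 0/1 flags, one per feature
    error_rate  classification error obtained with the selected features
    alpha       assumed trade-off weight between error and subset size
    """
    n_total = len(mask)
    n_selected = sum(mask)
    if n_selected == 0:
        # An empty subset cannot train a classifier, so it gets the worst score.
        return float("inf")
    return alpha * error_rate + (1.0 - alpha) * n_selected / n_total

# Usage: two of four features selected, 10% error.
score = wrapper_fitness([1, 0, 1, 0], error_rate=0.10)
```

With `alpha` close to 1, accuracy dominates and subset size mostly acts as a tie-breaker between masks of similar error.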
|
28
|
An enhanced binary Rat Swarm Optimizer based on local-best concepts of PSO and collaborative crossover operators for feature selection. Comput Biol Med 2022; 147:105675. [PMID: 35687926 DOI: 10.1016/j.compbiomed.2022.105675] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 02/16/2022] [Revised: 05/24/2022] [Accepted: 05/26/2022] [Indexed: 11/22/2022]
Abstract
In this paper, an enhanced binary version of the Rat Swarm Optimizer (RSO) is proposed to deal with Feature Selection (FS) problems. FS is an important data reduction step in data mining which finds the most representative features from the entire data. Many FS-based swarm intelligence algorithms have been used to tackle FS. However, the door is still open for further investigations since no FS method gives cutting-edge results for all cases. In this paper, a recent swarm intelligence metaheuristic method called RSO which is inspired by the social and hunting behavior of a group of rats is enhanced and explored for FS problems. The binary enhanced RSO is built based on three successive modifications: i) an S-shape transfer function is used to develop binary RSO algorithms; ii) the local search paradigm of particle swarm optimization is used with the iterative loop of RSO to boost its local exploitation; iii) three crossover mechanisms are used and controlled by a switch probability to improve the diversity. Based on these enhancements, three versions of RSO are produced, referred to as Binary RSO (BRSO), Binary Enhanced RSO (BERSO), and Binary Enhanced RSO with Crossover operators (BERSOC). To assess the performance of these versions, a benchmark of 24 datasets from various domains is used. The proposed methods are assessed concerning the fitness value, number of selected features, classification accuracy, specificity, sensitivity, and computational time. The best performance is achieved by BERSOC followed by BERSO and then BRSO. These proposed versions are comparatively assessed against 25 well-regarded metaheuristic methods and five filter-based approaches. The obtained results underline their superiority by producing new best results for some datasets.
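The S-shape transfer function named in modification i) can be sketched as follows. The sketch assumes the standard sigmoid form and a stochastic thresholding rule, which are common in binary metaheuristics; the exact variant used for BRSO is not stated in the abstract, and the function names here are hypothetical.

```python
import math
import random

def s_shape(value):
    """Sigmoid transfer function mapping a real value to a probability in (0, 1)."""
    clipped = max(-60.0, min(60.0, value))  # guard against overflow in exp
    return 1.0 / (1.0 + math.exp(-clipped))

def binarize(position, rng):
    """Turn a continuous position into a 0/1 feature mask.

    Each bit is set to 1 with probability s_shape(component) (assumed rule).
    """
    return [1 if rng.random() < s_shape(v) else 0 for v in position]

# Usage: large positive components are almost surely selected,
# large negative ones almost surely dropped.
mask = binarize([4.0, -4.0, 0.0], random.Random(42))
```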
|
29
|
Tubishat M, Rawshdeh Z, Jarrah H, Elgamal ZM, Elnagar A, Alrashdan MT. Dynamic generalized normal distribution optimization for feature selection. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07398-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
30
|
Liu W, Guo Z, Jiang F, Liu G, Wang D, Ni Z. Improved WOA and its application in feature selection. PLoS One 2022; 17:e0267041. [PMID: 35588402 PMCID: PMC9119564 DOI: 10.1371/journal.pone.0267041] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/12/2022] [Accepted: 03/31/2022] [Indexed: 11/18/2022] Open
Abstract
Feature selection (FS) can eliminate many redundant, irrelevant, and noisy features in high-dimensional data to improve machine learning or data mining models' prediction, classification, and computational performance. We proposed an improved whale optimization algorithm (IWOA) and improved k-nearest neighbors (IKNN) classifier approaches for feature selection (IWOAIKFS). Firstly, WOA is improved by using chaotic elite reverse individual, probability selection of skew distribution, nonlinear adjustment of control parameters and position correction strategy to enhance the search performance of the algorithm for feature subsets. Secondly, the sample similarity measurement criterion and weighted voting criterion based on the simulated annealing algorithm to solve the weight matrix M are proposed to improve the KNN classifier and improve the evaluation performance of the algorithm on feature subsets. The experimental results show: IWOA not only has better optimization performance when solving benchmark functions of different dimensions, but also when used with IKNN for feature selection, IWOAIKFS has better classification and robustness.
Affiliation(s)
- Wei Liu
- College of Science, Liaoning Technical University, Fuxin, Liaoning, China
- Zhiqing Guo
- College of Science, Liaoning Technical University, Fuxin, Liaoning, China
- Feng Jiang
- College of Science, Liaoning Technical University, Fuxin, Liaoning, China
- Guangwei Liu
- College of Mines, Liaoning Technical University, Fuxin, Liaoning, China
- Dong Wang
- College of Mines, Liaoning Technical University, Fuxin, Liaoning, China
- Zishun Ni
- College of Science, Liaoning Technical University, Fuxin, Liaoning, China
|
31
|
Backpropagation Neural Network optimization and software defect estimation modelling using a hybrid Salp Swarm optimizer-based Simulated Annealing Algorithm. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.108511] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/21/2022]
|
32
|
Abd El-Mageed AA, Gad AG, Sallam KM, Munasinghe K, Abohany AA. Improved Binary Adaptive Wind Driven Optimization Algorithm-Based Dimensionality Reduction for Supervised Classification. COMPUTERS & INDUSTRIAL ENGINEERING 2022; 167:107904. [DOI: 10.1016/j.cie.2021.107904] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 09/02/2023]
|
33
|
Abstract
Feature Selection (FS) is an important preprocessing step that is involved in machine learning and data mining tasks for preparing data (especially high-dimensional data) by eliminating irrelevant and redundant features, thus reducing the potential curse of dimensionality of a given large dataset. Consequently, FS is arguably a combinatorial NP-hard problem in which the computational time increases exponentially with an increase in problem complexity. To tackle such a problem type, meta-heuristic techniques have been adopted by an increasing number of scholars. Herein, a novel meta-heuristic algorithm, called Sparrow Search Algorithm (SSA), is presented. The SSA still performs poorly on exploratory behavior and exploration-exploitation trade-off because it does not duly stimulate the search within feasible regions, and the exploitation process suffers noticeable stagnation. Therefore, we improve SSA by adopting: i) a strategy for Random Re-positioning of Roaming Agents (3RA); and ii) a novel Local Search Algorithm (LSA), which are algorithmically incorporated into the original SSA structure. For the FS problem, SSA is improved and cloned as a binary variant, namely, the improved Binary SSA (iBSSA), which would strive to select the optimal or near-optimal features from a given dataset while keeping the classification accuracy maximized. For binary conversion, the iBSSA was primarily validated against nine common S-shaped and V-shaped Transfer Functions (TFs), thus producing nine iBSSA variants. To verify the robustness of these variants, three well-known classification techniques, including k-Nearest Neighbor (k-NN), Support Vector Machine (SVM), and Random Forest (RF) were adopted as fitness evaluators with the proposed iBSSA approach and many other competing algorithms, on 18 multifaceted, multi-scale benchmark datasets from the University of California Irvine (UCI) data repository.
Then, the overall best-performing iBSSA variant for each of the three classifiers was compared with binary variants of 12 different well-known meta-heuristic algorithms, including the original SSA (BSSA), Artificial Bee Colony (BABC), Particle Swarm Optimization (BPSO), Bat Algorithm (BBA), Grey Wolf Optimization (BGWO), Whale Optimization Algorithm (BWOA), Grasshopper Optimization Algorithm (BGOA), SailFish Optimizer (BSFO), Harris Hawks Optimization (BHHO), Bird Swarm Algorithm (BBSA), Atom Search Optimization (BASO), and Henry Gas Solubility Optimization (BHGSO). Based on a Wilcoxon’s non-parametric statistical test ($\alpha = 0.05$), the superiority of iBSSA with the three classifiers was very evident against counterparts across the vast majority of the selected datasets, achieving a feature size reduction of up to 92% along with up to 100% classification accuracy on some of those datasets.
|
34
|
Abstract
Communication via email has expanded dramatically in recent decades due to its cost-effectiveness, convenience, speed, and utility for a variety of contexts, including social, scientific, cultural, political, authentication, and advertising applications. Spam is an email sent to a large number of individuals or organizations without the recipient's desire or request. It is increasingly becoming a harmful part of email traffic and can negatively affect the usability of email systems. Such emails consume network bandwidth as well as storage space, causing email systems to slow down, wasting time and effort scanning and eliminating enormous amounts of useless information. Spam is also used for distributing offensive and harmful content on the Internet. The objective of the current study was to develop a new method for email spam detection with high accuracy and a low error rate. There are several methods to recognize, detect, filter, categorize, and delete spam emails, and most of the proposed methods have some error rate. None of the spam detection techniques, despite the optimizations performed, have been effective alone. A step in text mining and message classification is feature selection, and one of the best approaches for feature selection is the use of metaheuristic algorithms. This article introduces a new method for detecting spam using the Horse herd metaheuristic Optimization Algorithm (HOA). First, the continuous HOA was transformed into a discrete algorithm. The inputs of the resulting algorithm then became opposition-based and then converted to multiobjective. Finally, it was used for spam detection, which is a discrete and multiobjective problem. The evaluation results indicate that the proposed method performs better compared to other methods such as K-nearest neighbours-grey wolf optimisation, K-nearest neighbours, multilayer perceptron, support vector machine, and Naive Bayesian.
The results show that the new multiobjective opposition-based binary horse herd optimizer, running on the UCI data set, has been more successful in the average selection size and classification accuracy compared with other standard metaheuristic methods. According to the findings, the proposed algorithm is substantially more accurate in detecting spam emails in the data set in comparison with other similar algorithms, and it shows lower computational complexity.
|
35
|
Yin D, Chen D, Tang Y, Dong H, Li X. Adaptive feature selection with shapley and hypothetical testing: Case study of EEG feature engineering. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2021.11.063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
|
36
|
Liu Q, Liu M, Wang F, Xiao W. A dynamic stochastic search algorithm for high-dimensional optimization problems and its application to feature selection. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.108517] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/20/2023]
|
37
|
Kareem SS, Mostafa RR, Hashim FA, El-Bakry HM. An Effective Feature Selection Model Using Hybrid Metaheuristic Algorithms for IoT Intrusion Detection. SENSORS 2022; 22:s22041396. [PMID: 35214297 PMCID: PMC8962996 DOI: 10.3390/s22041396] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 12/27/2021] [Revised: 02/06/2022] [Accepted: 02/07/2022] [Indexed: 02/01/2023]
Abstract
The increasing use of Internet of Things (IoT) applications in various aspects of our lives has created a huge amount of data. IoT applications often require the presence of many technologies such as cloud computing and fog computing, which have led to serious challenges to security. As a result of the use of these technologies, cyberattacks are also on the rise because current security methods are ineffective. Several artificial intelligence (AI)-based security solutions have been presented in recent years, including intrusion detection systems (IDS). Feature selection (FS) approaches are required for the development of intelligent analytic tools that need data pretreatment and machine-learning algorithm-performance enhancement. By reducing the number of selected features, FS aims to improve classification accuracy. This article presents a new FS method through boosting the performance of Gorilla Troops Optimizer (GTO) based on the algorithm for bird swarms (BSA). This BSA is used to boost performance exploitation of GTO in the newly developed GTO-BSA because it has a strong ability to find feasible regions with optimal solutions. As a result, the quality of the final output will increase, improving convergence. GTO-BSA’s performance was evaluated using a variety of performance measures on four IoT-IDS datasets: NSL-KDD, CICIDS-2017, UNSW-NB15 and BoT-IoT. The results were compared to those of the original GTO, BSA, and several state-of-the-art techniques in the literature. According to the findings of the experiments, GTO-BSA had a better convergence rate and higher-quality solutions.
Affiliation(s)
- Saif S. Kareem
- Department of Information Systems, Faculty of Computers and Information Sciences, Mansoura University, Mansoura 35516, Egypt
- Reham R. Mostafa (corresponding author)
- Department of Information Systems, Faculty of Computers and Information Sciences, Mansoura University, Mansoura 35516, Egypt
- Fatma A. Hashim
- Faculty of Engineering, Helwan University, Cairo 11795, Egypt
- Hazem M. El-Bakry
- Department of Information Systems, Faculty of Computers and Information Sciences, Mansoura University, Mansoura 35516, Egypt
|
38
|
Hichem H, Elkamel M, Rafik M, Mesaaoud MT, Ouahiba C. A new binary grasshopper optimization algorithm for feature selection problem. JOURNAL OF KING SAUD UNIVERSITY - COMPUTER AND INFORMATION SCIENCES 2022. [DOI: 10.1016/j.jksuci.2019.11.007] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Indexed: 11/27/2022]
|
39
|
A Review of the Modification Strategies of the Nature Inspired Algorithms for Feature Selection Problem. MATHEMATICS 2022. [DOI: 10.3390/math10030464] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Indexed: 02/06/2023]
Abstract
This survey is an effort to provide a research repository and a useful reference for researchers to guide them when planning to develop new Nature-inspired Algorithms tailored to solve Feature Selection problems (NIAs-FS). We identified and performed a thorough literature review in three main streams of research lines: Feature selection problem, optimization algorithms, particularly, meta-heuristic algorithms, and modifications applied to NIAs to tackle the FS problem. We provide a detailed overview of 156 different articles about NIAs modifications for tackling FS. We support our discussions by analytical views, visualized statistics, applied examples, open-source software systems, and discuss open issues related to FS and NIAs. Finally, the survey summarizes the main foundations of NIAs-FS with approximately 34 different operators investigated. The most popular operator is chaotic maps. Hybridization is the most widely used modification technique. There are three types of hybridization: Integrating NIA with another NIA, integrating NIA with a classifier, and integrating NIA with a classifier. The most widely used hybridization is the one that integrates a classifier with the NIA. Microarray and medical applications are the dominant applications where most of the NIAs-FS are modified and used. Despite the popularity of the NIAs-FS, there are still many areas that need further investigation.
|
40
|
Learning-based monarch butterfly optimization algorithm for solving numerical optimization problems. Neural Comput Appl 2022. [DOI: 10.1007/s00521-021-06654-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 10/19/2022]
|
41
|
Binary Horse herd optimization algorithm with crossover operators for feature selection. Comput Biol Med 2021; 141:105152. [PMID: 34952338 DOI: 10.1016/j.compbiomed.2021.105152] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/29/2021] [Revised: 12/11/2021] [Accepted: 12/14/2021] [Indexed: 01/30/2023]
Abstract
This paper proposes a binary version of the Horse herd Optimization Algorithm (HOA) to tackle Feature Selection (FS) problems. This algorithm mimics the conduct of a pack of horses when they are trying to survive. To build a binary version of HOA, referred to as BHOA, two adjustments were made: i) Three transfer functions, namely S-shape, V-shape and U-shape, are utilized to transform the continuous domain into a binary one. Four configurations of each transfer function are also well studied to yield four alternatives. ii) Three crossover operators, namely one-point, two-point and uniform, are also suggested to ensure the efficiency of the proposed method for the FS domain. The performance of the proposed fifteen BHOA versions is examined using 24 real-world FS datasets. A set of six metric measures was used to evaluate the outcome of the optimization methods: accuracy, number of features selected, fitness values, sensitivity, specificity and computational time. The best-performing of the proposed versions is BHOA with S-shape and one-point crossover. The comparative evaluation was also accomplished against 21 state-of-the-art methods. The proposed method is able to find very competitive results where some of them are the best-recorded. Due to the viability of the proposed method, it can be further considered in other areas of machine learning.
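The three crossover operators listed in this abstract can be sketched on binary feature masks as follows. This is a generic illustration under the usual textbook definitions; the function names, the single-child return value, and the 0.5 mixing probability in the uniform operator are assumptions rather than details from the BHOA paper.

```python
import random

def one_point_crossover(a, b, rng):
    """Take the head of parent a and the tail of parent b, split at one random cut."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def two_point_crossover(a, b, rng):
    """Replace the segment between two random cuts in a with the same segment of b.

    Requires at least three features so that two distinct inner cuts exist.
    """
    i, j = sorted(rng.sample(range(1, len(a)), 2))
    return a[:i] + b[i:j] + a[j:]

def uniform_crossover(a, b, rng):
    """Pick each bit from a or b with equal probability (assumed 0.5)."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

# Usage: parents are all-zeros and all-ones masks, so the child shows
# which bits came from which parent.
rng = random.Random(3)
parent_a, parent_b = [0] * 6, [1] * 6
children = [op(parent_a, parent_b, rng)
            for op in (one_point_crossover, two_point_crossover, uniform_crossover)]
```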
|
42
|
An Improved Adaptive IVMD-WPT-Based Noise Reduction Algorithm on GPS Height Time Series. Sensors 2021; 21:s21248295. [PMID: 34960391 PMCID: PMC8709023 DOI: 10.3390/s21248295] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/20/2021] [Revised: 12/06/2021] [Accepted: 12/08/2021] [Indexed: 11/17/2022]
Abstract
The traditional variational mode decomposition (VMD) method cannot determine the number of intrinsic mode functions or the value of the penalty factor during noise reduction, which leads to inadequate decomposition or over-decomposition in time series analysis and limits the reliability of Global Positioning System (GPS) signal extraction. Therefore, this paper proposes a new approach using improved variational mode decomposition and wavelet packet transform (IVMD-WPT), which takes the energy entropy mutual information as the objective function and uses the grasshopper optimisation algorithm to optimise it, adaptively determining the number of modal decompositions and the value of the penalty factor. To verify the validity of the IVMD-WPT algorithm, we performed a test experiment with two groups of simulated time series and three indicators for evaluating the noise reduction effect: root mean square error (RMSE), correlation coefficient (CC) and signal-to-noise ratio (SNR). The simulation results showed that IVMD-WPT was better than the traditional empirical mode decomposition and improved variational mode decomposition (IVMD) methods: the RMSE decreased by 0.084 and 0.0715 mm; CC and SNR increased by 0.0005 and 0.0004 dB, and 862.28 and 6.17 dB, respectively. The simulation experiments verify the effectiveness of the proposed algorithm. Finally, we performed an analysis with 100 real GPS height time series from the Crustal Movement Observation Network of China (CMONOC). The results showed that the RMSE decreased by 11.4648 and 6.7322 mm, and CC and SNR increased by 0.1458 and 0.0588 dB, and 32.6773 and 26.3918 dB, respectively. In summary, the IVMD-WPT algorithm can adaptively determine the number of decomposition modal functions of VMD and the optimal combination of penalty factors; it helps to further extract effective information from noise and can retain the useful information in the original time series.
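The three indicators named in this abstract can be computed as in the sketch below. The abstract does not give their formulas, so the usual textbook definitions are assumed: RMSE against a reference series, the Pearson correlation coefficient, and SNR as ten times the base-10 logarithm of the ratio of reference power to residual power. The function names are hypothetical.

```python
import math


def rmse(reference, estimate):
    """Root mean square error between a reference series and a denoised estimate."""
    n = len(reference)
    return math.sqrt(sum((r - e) ** 2 for r, e in zip(reference, estimate)) / n)


def correlation_coefficient(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def snr_db(reference, estimate):
    """SNR in dB, assuming SNR = 10 * log10(reference power / residual power)."""
    signal_power = sum(r * r for r in reference)
    noise_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if noise_power == 0:
        return math.inf  # the estimate reproduces the reference exactly
    return 10.0 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    clean = [math.sin(0.1 * t) for t in range(200)]        # hypothetical reference
    denoised = [v + 0.01 * (-1) ** t for t, v in enumerate(clean)]
    print(rmse(clean, denoised),
          correlation_coefficient(clean, denoised),
          snr_db(clean, denoised))
```

A lower RMSE together with a higher CC and SNR indicates better noise reduction, which is the direction of the improvements reported above.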
|
43
|
|
44
|
Lin CC, Kang JR, Liang YL, Kuo CC. Simultaneous feature and instance selection in big noisy data using memetic variable neighborhood search. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2021.107855] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
|
45
|
Ibrahim RA, Abd Elaziz M, Ewees AA, El-Abd M, Lu S. New feature selection paradigm based on hyper-heuristic technique. Applied Mathematical Modelling 2021; 98:14-37. [DOI: 10.1016/j.apm.2021.04.018] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 09/01/2023]
|
46
|
A multi-leader Harris hawk optimization based on differential evolution for feature selection and prediction influenza viruses H1N1. Artif Intell Rev 2021. [DOI: 10.1007/s10462-021-10075-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022]
|
47
|
Ibrahim RA, Abualigah L, Ewees AA, Al-qaness MAA, Yousri D, Alshathri S, Abd Elaziz M. An Electric Fish-Based Arithmetic Optimization Algorithm for Feature Selection. Entropy 2021; 23:e23091189. [PMID: 34573818 PMCID: PMC8472813 DOI: 10.3390/e23091189] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 08/11/2021] [Revised: 09/05/2021] [Accepted: 09/06/2021] [Indexed: 11/16/2022]
Abstract
With the widespread use of intelligent information systems, a massive amount of data with many irrelevant, noisy, and redundant features is collected; moreover, many features must be handled. Therefore, introducing an efficient feature selection (FS) approach becomes a challenging aim. In the recent decade, various artificial methods and swarm models inspired by biological and social systems have been proposed to solve different problems, including FS. Thus, in this paper, an innovative approach is proposed based on a hybrid integration between two intelligent algorithms, electric fish optimization (EFO) and the arithmetic optimization algorithm (AOA), to boost the exploration stage of EFO so that it can process high-dimensional FS problems with a remarkable convergence speed. The proposed EFOAOA is examined with eighteen datasets for different real-life applications. The EFOAOA results are compared with a set of recent state-of-the-art optimizers using a set of statistical metrics and the Friedman test. The comparisons show the positive impact of integrating the AOA operator in the EFO, as the proposed EFOAOA can identify the most important features with high accuracy and efficiency. Compared with the other FS methods, it obtained the smallest number of features and the highest accuracy in 50% and 67% of the datasets, respectively.
Affiliation(s)
- Rehab Ali Ibrahim
- Department of Mathematics, Faculty of Science, Zagazig University, Zagazig 44519, Egypt; (R.A.I.); (M.A.E.)
- Laith Abualigah
- Faculty of Computer Sciences and Informatics, Amman Arab University, Amman 11953, Jordan
- Ahmed A. Ewees
- Department of Computer, Damietta University, Damietta 34517, Egypt
- Mohammed A. A. Al-qaness
- State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
- Dalia Yousri
- Electrical Engineering Department, Faculty of Engineering, Fayoum University, Fayoum 63514, Egypt
- Samah Alshathri
- Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh 84428, Saudi Arabia
- Correspondence:
- Mohamed Abd Elaziz
- Department of Mathematics, Faculty of Science, Zagazig University, Zagazig 44519, Egypt; (R.A.I.); (M.A.E.)
- Artificial Intelligence Research Center (AIRC), Ajman University, Ajman 346, United Arab Emirates
|
48
|
Huang Y, Shen Z, Cai F, Li T, Lv F. Adaptive graph-based generalized regression model for unsupervised feature selection. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.107156] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/26/2022]
|
49
|
A robust multiobjective Harris’ Hawks Optimization algorithm for the binary classification problem. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.107219] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 12/24/2022]
|
50
|
Abstract
Computer-aided diagnosis permits biopsy specimen analysis by creating quantitative images of brain diseases, which enable pathologists to examine the data properly. It has been observed from other image classification algorithms that the Extreme Learning Machine (ELM) demonstrates superior performance in terms of computational effort. In this study, to classify brain Magnetic Resonance Images as either normal or diseased, a hybridized Salp Swarm Algorithm-based ELM (ELM-SSA) is proposed. The SSA is employed to optimize the parameters associated with the ELM model, whereas the Discrete Wavelet Transformation and Principal Component Analysis have been used for feature extraction and reduction, respectively. The performance of the proposed “ELM-SSA” is evaluated through a simulation study and compared with standard classifiers such as the Back-Propagation Neural Network, Functional Link Artificial Neural Network, and Radial Basis Function Network. All experimental validations have been carried out using two different brain disease datasets: Alzheimer’s and Hemorrhage. The simulation results demonstrate that the “ELM-SSA” is potentially superior to other hybrid methods in terms of ROC, AUC, and accuracy. To achieve better performance and to reduce randomness and overfitting, each algorithm has been run multiple times and a k-fold stratified cross-validation strategy has been used.
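The stratified k-fold cross-validation mentioned at the end of this abstract can be sketched as follows. The abstract gives neither the value of k nor the splitting procedure, so this is only an illustration under assumptions: indices of each class are shuffled and dealt round-robin into k folds so that every fold keeps roughly the class proportions of the full dataset. The function name and parameters are hypothetical.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_indices, test_indices) pairs with per-class balance in each fold."""
    if k < 2:
        raise ValueError("k must be at least 2")
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices round-robin into the folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


if __name__ == "__main__":
    labels = ["normal"] * 6 + ["diseased"] * 4   # hypothetical labels
    for train, test in stratified_k_fold(labels, k=2, seed=1):
        print(train, test)
```

Repeating the whole procedure with different seeds corresponds to running each algorithm multiple times, which reduces the influence of any single random split.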
|