1
Zhang L, Chen X. Social coevolution and Sine chaotic opposition learning Chimp Optimization Algorithm for feature selection. Sci Rep 2024; 14:15413. [PMID: 38965341 PMCID: PMC11224333 DOI: 10.1038/s41598-024-66285-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Accepted: 07/01/2024] [Indexed: 07/06/2024] [Open Access]
Abstract
Feature selection is an active research problem in machine learning. Swarm intelligence algorithms play an essential role in feature selection due to their excellent optimization ability. The Chimp Optimization Algorithm (CHoA) is a recent swarm intelligence algorithm that has quickly won widespread attention in the academic community due to its fast convergence speed and easy implementation. However, CHoA struggles to balance local and global search, which limits its optimization accuracy and leads to premature convergence, thus affecting the algorithm's performance on feature selection tasks. This study proposes the Social coevolution and Sine chaotic opposition learning Chimp Optimization Algorithm (SOSCHoA). SOSCHoA enhances inter-population interaction through social coevolution, improving local search. Additionally, it introduces sine chaotic opposition learning to increase population diversity and avoid entrapment in local optima. Extensive experiments on 12 high-dimensional classification datasets demonstrate that SOSCHoA outperforms existing algorithms in classification accuracy, convergence, and stability. Although SOSCHoA shows advantages in handling high-dimensional datasets, there is room for future research and optimization, particularly concerning feature dimensionality reduction.
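The abstract does not give the update equations, so the following is only a minimal sketch of one common way to combine a sine chaotic map with opposition-based learning. The map `x_{k+1} = (a/4) * sin(pi * x_k)`, the opposite point `lb + ub - c * x`, the greedy replacement rule, and every function name are assumptions made for illustration, not the authors' SOSCHoA formulation.

```python
import math

def sine_map(x0, n, a=4.0):
    """Return n values of the sine map x_{k+1} = (a/4) * sin(pi * x_k)."""
    values, x = [], x0
    for _ in range(n):
        x = (a / 4.0) * math.sin(math.pi * x)
        values.append(x)
    return values

def chaotic_opposite(position, lower, upper, chaos):
    """Opposite point lb + ub - c * x per dimension, clamped to the bounds."""
    return [min(max(lo + hi - c * x, lo), hi)
            for x, lo, hi, c in zip(position, lower, upper, chaos)]

def sphere(position):
    """Toy objective to minimize (assumed here purely for the demo)."""
    return sum(v * v for v in position)

def opposition_step(population, lower, upper, fitness, x0=0.37):
    """Keep, for each individual, the better of itself and its chaotic opposite."""
    dim = len(lower)
    chaos = sine_map(x0, dim * len(population))
    result = []
    for i, individual in enumerate(population):
        c = chaos[i * dim:(i + 1) * dim]
        candidate = chaotic_opposite(individual, lower, upper, c)
        result.append(min(individual, candidate, key=fitness))
    return result
```

Because the step is greedy, it can only keep or improve each individual's fitness; the chaotic coefficient is what spreads the candidate points instead of mirroring them exactly.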
Affiliation(s)
- Li Zhang
  - College of Computer Engineering, Jiangsu University of Technology, Changzhou, 213001, People's Republic of China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, People's Republic of China
- XiaoBo Chen
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, People's Republic of China
  - People's Bank of China Changzhou City Center Branch, Jiangsu, 213001, Changzhou, People's Republic of China
2
Ye M, Zhou H, Yang H, Hu B, Wang X. Multi-Strategy Improved Dung Beetle Optimization Algorithm and Its Applications. Biomimetics (Basel) 2024; 9:291. [PMID: 38786501 PMCID: PMC11117942 DOI: 10.3390/biomimetics9050291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2024] [Revised: 05/03/2024] [Accepted: 05/05/2024] [Indexed: 05/25/2024] [Open Access]
Abstract
The dung beetle optimization (DBO) algorithm, a swarm intelligence-based metaheuristic, is renowned for its robust optimization capability and fast convergence speed. However, it also suffers from low population diversity, susceptibility to local optima, and unsatisfactory convergence speed when facing complex optimization problems. In response, this paper proposes the multi-strategy improved dung beetle optimization algorithm (MDBO). The core improvements include using Latin hypercube sampling for better population initialization and introducing a novel differential variation strategy, termed "Mean Differential Variation", to enhance the algorithm's ability to evade local optima. Moreover, a strategy combining lens imaging reverse learning and dimension-by-dimension optimization is proposed and applied to the current optimal solution. Through comprehensive performance testing on standard benchmark functions from CEC2017 and CEC2020, MDBO demonstrates superior performance in terms of optimization accuracy, stability, and convergence speed compared with other classical metaheuristic optimization algorithms. Additionally, the efficacy of MDBO in addressing complex real-world engineering problems is validated through three representative engineering application scenarios, namely extension/compression spring design problems, reducer design problems, and welded beam design problems.
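Latin hypercube sampling, mentioned above as the initialization step, can be sketched in a few lines. This is a generic textbook version (one draw per equal-width stratum in every dimension, with strata shuffled independently); the function name and signature are hypothetical and the paper's exact initialization may differ.

```python
import random

def latin_hypercube(n_samples, lower, upper, rng):
    """Return n_samples points with exactly one point per stratum in each dimension."""
    dim = len(lower)
    columns = []
    for d in range(dim):
        # One uniform draw inside each of the n equal-width strata of [0, 1).
        strata = [(k + rng.random()) / n_samples for k in range(n_samples)]
        # Shuffle so strata are paired at random across dimensions.
        rng.shuffle(strata)
        columns.append([lower[d] + u * (upper[d] - lower[d]) for u in strata])
    # Transpose the per-dimension columns into a list of individuals.
    return [[columns[d][i] for d in range(dim)] for i in range(n_samples)]
```

Compared with plain uniform sampling, this guarantees that every slice of each coordinate range is occupied by exactly one individual, which is the usual argument for using it to seed a population.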
Affiliation(s)
- Mingjun Ye
  - School of Information Science and Technology, Yunnan Normal University, Kunming 650500, China
- Heng Zhou
  - Department of Internet of Things and Artificial Intelligence, Wuxi Vocational College of Science and Technology, Wuxi 214028, China
- Haoyu Yang
  - College of Engineering, Informatics, and Applied Sciences, Flagstaff, AZ 86011, USA
- Bin Hu
  - Department of Computer Science and Technology, Kean University, Union, NJ 07083, USA
- Xiong Wang
  - School of Information Science and Engineering, Yunnan University, Kunming 650500, China
3
Mahindru A, Arora H, Kumar A, Gupta SK, Mahajan S, Kadry S, Kim J. PermDroid a framework developed using proposed feature selection approach and machine learning techniques for Android malware detection. Sci Rep 2024; 14:10724. [PMID: 38730228 DOI: 10.1038/s41598-024-60982-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2023] [Accepted: 04/29/2024] [Indexed: 05/12/2024] [Open Access]
Abstract
Developing an Android malware detection framework that can identify malware in real-world apps is a difficult challenge for academicians and researchers. The vulnerability lies in the permission model of Android, which has led various researchers to develop Android malware detection models using a permission or a set of permissions. Previous studies have used all extracted features, which overburdens the construction of malware detection models. However, the effectiveness of a machine learning model depends on relevant features, which have excellent discriminative power and help reduce misclassification errors. This research paper proposes a feature selection framework that helps in selecting the relevant features. In the first stage of the proposed framework, a t-test and univariate logistic regression are applied to our collected feature data set to assess each feature's capacity for detecting malware. In the second stage, multivariate linear regression stepwise forward selection and correlation analysis are applied to evaluate the correctness of the features selected in the first stage. The resulting features are then used as input to develop malware detection models using three ensemble methods and a neural network with six different machine-learning algorithms. The developed models' performance is compared using two performance parameters: F-measure and accuracy. The experiment is performed using half a million different Android apps. The empirical findings reveal that the malware detection model developed using features selected by the proposed feature selection framework achieved a higher detection rate than the model developed using the full extracted feature data set. Further, when compared to previously developed frameworks or methodologies, the experimental results indicate that the model developed in this study achieved an accuracy of 98.8%.
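As a rough illustration of a two-stage filter in the spirit of the one described above, the sketch below keeps features whose class means differ according to Welch's t-statistic and then drops features that are nearly collinear with one already kept. It deliberately simplifies the paper's pipeline: the regression steps are omitted, and the thresholds `t_min` and `r_max`, the function names, and the toy data layout are all assumptions.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t-statistic for two independent samples."""
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        # Zero within-class spread: perfectly separated if the means differ.
        return math.copysign(math.inf, diff) if diff else 0.0
    return diff / se

def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def select_features(rows, labels, t_min=2.0, r_max=0.9):
    """Return indices of features that pass both stages (labels are 0 or 1)."""
    n_features = len(rows[0])
    columns = [[row[j] for row in rows] for j in range(n_features)]
    # Stage 1: keep features whose class means differ clearly.
    stage1 = []
    for j, col in enumerate(columns):
        pos = [v for v, y in zip(col, labels) if y == 1]
        neg = [v for v, y in zip(col, labels) if y == 0]
        if abs(welch_t(pos, neg)) >= t_min:
            stage1.append(j)
    # Stage 2: drop a feature if it is nearly collinear with one already kept.
    kept = []
    for j in stage1:
        if all(abs(pearson(columns[j], columns[k])) < r_max for k in kept):
            kept.append(j)
    return kept
```

In this toy setup a duplicated permission column survives stage 1 but is removed in stage 2, which is the kind of redundancy a correlation check is meant to catch.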
Affiliation(s)
- Arvind Mahindru
  - Department of Computer Science and Applications, D.A.V. University, Sarmastpur, Jalandhar, 144012, India
- Himani Arora
  - Department of Mathematics, Guru Nanak Dev University, Amritsar, India
- Abhinav Kumar
  - Department of Nuclear and Renewable Energy, Ural Federal University Named after the First President of Russia Boris Yeltsin, Ekaterinburg, Russia, 620002
- Sachin Kumar Gupta
  - Department of Electronics and Communication Engineering, Central University of Jammu, Jammu, 181143, UT of J&K, India
  - School of Electronics and Communication Engineering, Shri Mata Vaishno Devi University, Katra, 182320, UT of J&K, India
- Shubham Mahajan
  - Department of Applied Data Science, Noroff University College, Kristiansand, Norway
- Seifedine Kadry
  - Department of Applied Data Science, Noroff University College, Kristiansand, Norway
  - Artificial Intelligence Research Center (AIRC), Ajman University, Ajman, 346, United Arab Emirates
  - MEU Research Unit, Middle East University, Amman 11831, Jordan
  - Applied Science Research Center, Applied Science Private University, Amman, Jordan
- Jungeun Kim
  - Department of Software, Department of Computer Science and Engineering, Kongju National University, Cheonan, 31080, Korea
4
Zhang L, Chen X. Enhanced chimp hierarchy optimization algorithm with adaptive lens imaging for feature selection in data classification. Sci Rep 2024; 14:6910. [PMID: 38519568 PMCID: PMC10959962 DOI: 10.1038/s41598-024-57518-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Accepted: 03/19/2024] [Indexed: 03/25/2024] [Open Access]
Abstract
Feature selection is a critical component of machine learning and data mining that removes redundant and irrelevant features from a dataset. The Chimp Optimization Algorithm (CHoA) is widely applicable to various optimization problems due to its low number of parameters and fast convergence rate. However, CHoA has a weak exploration capability and tends to fall into local optima when solving feature selection problems, leading to ineffective removal of irrelevant and redundant features. To solve this problem, this paper proposes the Enhanced Chimp Hierarchy Optimization Algorithm with adaptive lens imaging (ALI-CHoASH) to search for the optimal feature subset in classification problems. Specifically, to enhance the exploration and exploitation capability of CHoA, we designed a chimp social hierarchy. We employed a novel social class factor to label the class of each chimp, enabling effective modelling and optimization of the relationships among chimp individuals. Then, to capture the social and collaborative behaviours of chimps in different social classes, we introduce additional prey-attacking and autonomous search strategies to help chimp individuals approach the optimal solution faster. In addition, considering the poor diversity of chimp groups in late iterations, we propose an adaptive lens imaging back-learning strategy to keep the algorithm from falling into a local optimum. Finally, we validate the improvement of ALI-CHoASH in exploration and exploitation capabilities using several high-dimensional datasets. We also compare ALI-CHoASH with eight state-of-the-art methods in classification accuracy, feature subset size, and computation time to demonstrate its superiority.
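The lens-imaging idea can be illustrated with the formula commonly used in the opposition-based learning literature, `x* = (a + b)/2 + (a + b)/(2k) - x/k`, which reduces to the plain opposite point `a + b - x` when `k = 1`. The sketch below assumes that formula; the linear schedule for `k` is a hypothetical stand-in for the paper's adaptive rule, which the abstract does not specify.

```python
def lens_opposite(position, lower, upper, k):
    """Lens-imaging opposite point per dimension; k = 1 gives plain opposition."""
    result = []
    for x, lo, hi in zip(position, lower, upper):
        mid = (lo + hi) / 2.0
        x_star = mid + mid / k - x / k
        # Clamp so the reflected point stays inside the search range.
        result.append(min(max(x_star, lo), hi))
    return result

def scaling_factor(iteration, max_iterations, k_start=1.0, k_end=10.0):
    """Hypothetical linear schedule: k grows with the iteration counter."""
    return k_start + (k_end - k_start) * iteration / max_iterations
```

With a growing `k`, the reflected point moves from the full mirror image toward the centre of the range, so the size of the jump shrinks over the run under this assumed schedule.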
Affiliation(s)
- Li Zhang
  - College of Computer Engineering, Jiangsu University of Technology, Changzhou, 213001, People's Republic of China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, People's Republic of China
- XiaoBo Chen
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, People's Republic of China
  - People's Bank of China Changzhou City Center Branch, Changzhou, 213001, Jiangsu, People's Republic of China
5
Aragones DG, Palomino-Segura M, Sicilia J, Crainiciuc G, Ballesteros I, Sánchez-Cabo F, Hidalgo A, Calvo GF. Variable selection for nonlinear dimensionality reduction of biological datasets through bootstrapping of correlation networks. Comput Biol Med 2024; 168:107827. [PMID: 38086138 DOI: 10.1016/j.compbiomed.2023.107827] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Revised: 11/15/2023] [Accepted: 12/04/2023] [Indexed: 01/10/2024]
Abstract
Identifying the most relevant variables or features in massive datasets for dimensionality reduction can lead to improved and more informative display, faster computation times, and more explainable models of complex systems. Despite significant advances and available algorithms, this task generally remains challenging, especially in unsupervised settings. In this work, we propose a method that constructs correlation networks using all intervening variables and then selects the most informative ones based on network bootstrapping. The method can be applied in both supervised and unsupervised scenarios. We demonstrate its functionality by applying Uniform Manifold Approximation and Projection for dimensionality reduction to several high-dimensional biological datasets, derived from 4D live imaging recordings of hundreds of morpho-kinetic variables, describing the dynamics of thousands of individual leukocytes at sites of prominent inflammation. We compare our method with other standard ones in the field, such as Principal Component Analysis and Elastic Net, showing that it outperforms them. The proposed method can be employed in a wide range of applications, encompassing data analysis and machine learning.
Affiliation(s)
- David G Aragones
  - Department of Mathematics & MOLAB-Mathematical Oncology Laboratory, Universidad de Castilla-La Mancha, Ciudad Real, Spain
- Miguel Palomino-Segura
  - Area of Cell and Developmental Biology, Centro Nacional de Investigaciones Cardiovasculares Carlos III, Madrid, Spain
  - Immunophysiology Research Group, Instituto Universitario de Investigación Biosanitaria de Extremadura (INUBE), Badajoz, Spain
  - Department of Physiology, Faculty of Sciences, University of Extremadura, Badajoz, Spain
- Jon Sicilia
  - Area of Cell and Developmental Biology, Centro Nacional de Investigaciones Cardiovasculares Carlos III, Madrid, Spain
- Georgiana Crainiciuc
  - Area of Cell and Developmental Biology, Centro Nacional de Investigaciones Cardiovasculares Carlos III, Madrid, Spain
- Iván Ballesteros
  - Area of Cell and Developmental Biology, Centro Nacional de Investigaciones Cardiovasculares Carlos III, Madrid, Spain
- Fátima Sánchez-Cabo
  - Bioinformatics Unit, Centro Nacional de Investigaciones Cardiovasculares Carlos III, Madrid, Spain
- Andrés Hidalgo
  - Vascular Biology and Therapeutics Program and Department of Immunobiology, Yale University School of Medicine, New Haven, CT, USA
- Gabriel F Calvo
  - Department of Mathematics & MOLAB-Mathematical Oncology Laboratory, Universidad de Castilla-La Mancha, Ciudad Real, Spain
6
Yu X, Qin W, Lin X, Shan Z, Huang L, Shao Q, Wang L, Chen M. Synergizing the enhanced RIME with fuzzy K-nearest neighbor for diagnose of pulmonary hypertension. Comput Biol Med 2023; 165:107408. [PMID: 37672924 DOI: 10.1016/j.compbiomed.2023.107408] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 07/19/2023] [Revised: 08/19/2023] [Accepted: 08/27/2023] [Indexed: 09/08/2023]
Abstract
Pulmonary hypertension (PH) is an uncommon yet severe condition characterized by sustained elevation of blood pressure in the pulmonary arteries. Delaying treatment can result in disease progression, right ventricular failure, increased risk of complications, and even death. Early recognition and timely treatment are crucial in halting PH progression, improving cardiac function, and reducing complications. In this study, we present a promising hybrid model, bERIME_FKNN, a feature selection approach that integrates the enhanced rime algorithm (ERIME) with the fuzzy K-nearest neighbor (FKNN) technique. ERIME introduces a triangular game search strategy, which strengthens the algorithm's capacity for global exploration by selecting distinct search agents across the search domain and fostering both competition and collaboration among them. Moreover, a random follower search strategy is incorporated to give the principal search agent a new trajectory, thereby widening the range of search directions. First, ERIME is compared with 11 state-of-the-art algorithms on the IEEE CEC2017 benchmark functions at dimensionalities of 10, 30, 50, and 100, validating its optimization capability within the model. Subsequently, employing the color moment and grayscale co-occurrence matrix methodologies, a total of 118 features are extracted from 63 PH patients' and 60 healthy individuals' images, alongside an analysis of 14,514 recordings obtained from these patients utilizing the developed bERIME_FKNN model. The results show that the bERIME_FKNN model performs strongly in PH classification, attaining an accuracy and specificity exceeding 99%. This implies that the model can serve as a valuable computer-aided tool, providing an advance warning system for diagnosis and prognosis evaluation of PH.
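For readers unfamiliar with FKNN, the sketch below shows the classic distance-weighted membership rule with crisp training labels: each of the k nearest neighbours votes for its class with weight `1 / d^(2/(m-1))`, and the votes are normalized to sum to one. It illustrates the general technique only; the fuzzifier `m`, the value of `k`, and the function name are assumptions, and it is not the bERIME_FKNN model itself.

```python
import math

def fuzzy_knn(train_x, train_y, query, k=3, m=2.0):
    """Return a dict mapping each class among the k neighbours to its membership."""
    # Distances from the query to every training point, nearest first.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))[:k]
    exponent = 2.0 / (m - 1.0)
    # Inverse-distance weights; the floor avoids division by zero on exact matches.
    weights = [(1.0 / max(d, 1e-12) ** exponent, y) for d, y in dists]
    total = sum(w for w, _ in weights)
    memberships = {}
    for w, y in weights:
        memberships[y] = memberships.get(y, 0.0) + w / total
    return memberships
```

Unlike a majority vote, the result is a graded membership per class, so a distant neighbour of the other class lowers confidence slightly instead of counting as a full vote.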
Affiliation(s)
- Xiaoming Yu
  - Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000, Zhejiang, China
- Wenxiang Qin
  - The First School of Medicine, School of Information and Engineering, Wenzhou Medical University, Wenzhou, 325000, Zhejiang, China
- Xiao Lin
  - Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000, Zhejiang, China
- Zhuohan Shan
  - The First School of Medicine, School of Information and Engineering, Wenzhou Medical University, Wenzhou, 325000, Zhejiang, China
- Liyao Huang
  - Key Laboratory of Intelligent Informatics for Safety & Emergency of Zhejiang Province, Wenzhou University, Wenzhou, 325035, China
- Qike Shao
  - Key Laboratory of Intelligent Informatics for Safety & Emergency of Zhejiang Province, Wenzhou University, Wenzhou, 325035, China
- Liangxing Wang
  - Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000, Zhejiang, China
- Mayun Chen
  - Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000, Zhejiang, China
7
Jia L, Wang T, Gad AG, Salem A. A weighted-sum chaotic sparrow search algorithm for interdisciplinary feature selection and data classification. Sci Rep 2023; 13:14061. [PMID: 37640716 PMCID: PMC10462760 DOI: 10.1038/s41598-023-38252-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/25/2023] [Accepted: 07/05/2023] [Indexed: 08/31/2023] [Open Access]
Abstract
In today's data-driven digital culture, there is a critical demand for optimized solutions that reduce operating expenses while increasing productivity. The memory and processing time available for handling enormous volumes of data are subject to a number of limitations, and these become more of a problem when a dataset contains redundant and uninformative data. For instance, many datasets contain non-informative features that mislead a given classification algorithm. To tackle this, researchers have been developing a variety of feature selection (FS) techniques that aim to eliminate unnecessary information from raw datasets before they are passed to a machine learning (ML) algorithm. Meta-heuristic optimization algorithms are often a solid choice for solving NP-hard problems like FS. In this study, we present a wrapper FS technique based on the sparrow search algorithm (SSA), a type of meta-heuristic. SSA is a swarm intelligence (SI) method that stands out because of its quick convergence and improved stability. Like the majority of SI algorithms, however, SSA has drawbacks such as lower swarm diversity and weak exploration ability in late iterations. Using ten chaotic maps, we therefore try to improve SSA in three ways: (i) the initial swarm generation; (ii) the substitution of two random variables in SSA; and (iii) clamping the sparrows crossing the search range. The result is CSSA, a chaotic form of SSA. Extensive comparisons show CSSA to be superior in terms of swarm diversity and convergence speed in solving various representative functions from the Institute of Electrical and Electronics Engineers (IEEE) Congress on Evolutionary Computation (CEC) benchmark set. Furthermore, experimental analysis of CSSA on eighteen interdisciplinary, multi-scale ML datasets from the University of California Irvine (UCI) data repository, as well as three high-dimensional microarray datasets, demonstrates that CSSA outperforms twelve state-of-the-art algorithms in an FS-based classification task. Finally, a 5%-significance-level statistical post-hoc analysis based on Wilcoxon's signed-rank test, Friedman's rank test, and Nemenyi's test confirms CSSA's significance in terms of overall fitness, classification accuracy, selected feature size, computational time, convergence trace, and stability.
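A wrapper FS method of this kind typically scores each candidate with a weighted sum of classification error and the fraction of features retained. The sketch below shows that general pattern under stated assumptions: a continuous position is binarized at 0.5, the classifier is a toy leave-one-out 1-nearest-neighbour, and `alpha = 0.99` is a value often seen in the literature. None of these choices is taken from the paper.

```python
import math

def binarize(position, threshold=0.5):
    """Map a continuous position to a 0/1 feature mask (assumed threshold rule)."""
    return [1 if v > threshold else 0 for v in position]

def loo_1nn_error(rows, labels, mask):
    """Leave-one-out error of a 1-nearest-neighbour classifier on the masked features."""
    idx = [j for j, bit in enumerate(mask) if bit]
    errors = 0
    for i, row in enumerate(rows):
        best = min((k for k in range(len(rows)) if k != i),
                   key=lambda k: math.dist([row[j] for j in idx],
                                           [rows[k][j] for j in idx]))
        errors += labels[best] != labels[i]
    return errors / len(rows)

def fitness(position, rows, labels, alpha=0.99):
    """Weighted sum to minimize: alpha * error + (1 - alpha) * selected / total."""
    mask = binarize(position)
    if not any(mask):
        return 1.0  # an empty subset cannot classify; give it the worst score
    ratio = sum(mask) / len(mask)
    return alpha * loo_1nn_error(rows, labels, mask) + (1 - alpha) * ratio
```

Because `alpha` is close to one, accuracy dominates and subset size mostly acts as a tie-breaker between masks with similar error.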
Affiliation(s)
- LiYun Jia
  - Department of Mathematics and Physics, Hebei University of Architecture, Zhangjiakou, 075000, China
- Tao Wang
  - Department of Mathematics and Physics, Hebei University of Architecture, Zhangjiakou, 075000, China
- Ahmed G Gad
  - Faculty of Computers and Information, Kafrelsheikh University, Kafrelsheikh, 33516, Egypt
- Ahmed Salem
  - College of Computing and Information Technology, Arab Academy for Science, Technology and Maritime Transport (AASTMT), Cairo, Egypt
8
Rai R, Dhal KG. Recent Developments in Equilibrium Optimizer Algorithm: Its Variants and Applications. Archives of Computational Methods in Engineering: State of the Art Reviews 2023; 30:1-54. [PMID: 37359743 PMCID: PMC10096115 DOI: 10.1007/s11831-023-09923-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Accepted: 03/26/2023] [Indexed: 06/28/2023]
Abstract
Many algorithms introduced in the literature are inspired by events observable in nature, such as evolutionary phenomena, the actions of social creatures or agents, broad principles based on physical processes, the nature of chemical reactions, human behavior and intelligence, the intelligent behavior of plants, and numerical techniques and mathematical programming procedures. Nature-inspired metaheuristic algorithms have dominated the scientific literature and have become a widely used computing paradigm over the past two decades. Equilibrium Optimizer, popularly known as EO, is a population-based, nature-inspired metaheuristic that belongs to the class of physics-based optimization algorithms. It is inspired by physics-based dynamic source and sink models that are used to estimate equilibrium states. EO has achieved wide recognition, and quite a few modifications of the original EO have been proposed. This article gives a thorough review of EO and its variations. We started with 175 research articles published by several major publishers. Additionally, we discuss the strengths and weaknesses of the algorithms to help researchers find the variant that best suits their needs. The core optimization problems from numerous application areas using EO are also covered in the study, including image classification, scheduling problems, and many others. Lastly, this work recommends a few potential areas for future EO research.
Affiliation(s)
- Rebika Rai
  - Department of Computer Applications, Sikkim University, Sikkim, India
- Krishna Gopal Dhal
  - Department of Computer Science and Application, Midnapore College (Autonomous), Paschim Medinipur, Midnapore, West Bengal, India