1
Shin H, Oh S. An effective heuristic for developing hybrid feature selection in high dimensional and low sample size datasets. BMC Bioinformatics 2024; 25:390. [PMID: 39722052 DOI: 10.1186/s12859-024-06017-9] [Received: 10/14/2024] [Accepted: 12/17/2024] [Indexed: 12/28/2024]
Abstract
BACKGROUND: High-dimensional datasets with low sample sizes (HDLSS) are pivotal in biology and bioinformatics. A core objective in HDLSS analysis is to select the most informative features and discard redundant or irrelevant ones. This is particularly crucial in bioinformatics, where accurate feature (gene) selection can lead to breakthroughs in drug development and provide insights into disease diagnostics. Despite its importance, identifying optimal features remains a significant challenge in HDLSS settings. RESULTS: To address this challenge, we propose an effective feature selection method that combines gradual permutation filtering with a heuristic tribrid search strategy, specifically tailored for HDLSS contexts. The proposed method considers inter-feature interactions and leverages feature rankings during the search process. In addition, we suggest a new performance metric for HDLSS that evaluates both the number and the quality of selected features. In a comparison with existing methods on benchmark data, the proposed method reduced the average number of selected features from 37.8 to 5.5 and improved the performance of the prediction model built on the selected features from 0.855 to 0.927. CONCLUSIONS: The proposed method effectively selects a small number of important features and achieves high prediction performance.
Affiliation(s)
- Hyunseok Shin
  - Department of Computer Science, Dankook University, Youngin, Gyeonggi, South Korea
- Sejong Oh
  - Department of Software Science, Dankook University, Youngin, Gyeonggi, South Korea
2
Chen T, Yi Y. Multi-Strategy Enhanced Parrot Optimizer: Global Optimization and Feature Selection. Biomimetics (Basel) 2024; 9:662. [PMID: 39590234 PMCID: PMC11591862 DOI: 10.3390/biomimetics9110662] [Received: 09/09/2024] [Revised: 10/24/2024] [Accepted: 10/30/2024] [Indexed: 11/28/2024]
Abstract
Optimization algorithms are pivotal in addressing complex problems across diverse domains, including global optimization and feature selection (FS). In this paper, we introduce the Enhanced Crisscross Parrot Optimizer (ECPO), an improved version of the Parrot Optimizer (PO), designed to address these challenges effectively. The ECPO incorporates a sophisticated strategy selection mechanism that allows individuals to retain successful behaviors from prior iterations and shift to alternative strategies in case of update failures. Additionally, the integration of a crisscross (CC) mechanism promotes more effective information exchange among individuals, enhancing the algorithm's exploration capabilities. The proposed algorithm's performance is evaluated through extensive experiments on the CEC2017 benchmark functions, where it is compared with ten other conventional optimization algorithms. Results demonstrate that the ECPO consistently outperforms these algorithms across various fitness landscapes. Furthermore, a binary version of the ECPO is developed and applied to FS problems on ten real-world datasets, demonstrating its ability to achieve competitive error rates with reduced feature subsets. These findings suggest that the ECPO holds promise as an effective approach for both global optimization and feature selection.
Affiliation(s)
- Yuanyuan Yi
  - College of Geophysics and Petroleum Resources, Yangtze University, Wuhan 430100, China
3
Gil-Rios MA, Cruz-Aceves I, Hernandez-Aguirre A, Hernandez-Gonzalez MA, Solorio-Meza SE. Improving Automatic Coronary Stenosis Classification Using a Hybrid Metaheuristic with Diversity Control. Diagnostics (Basel) 2024; 14:2372. [PMID: 39518340 PMCID: PMC11545375 DOI: 10.3390/diagnostics14212372] [Received: 09/25/2024] [Revised: 10/17/2024] [Accepted: 10/18/2024] [Indexed: 11/16/2024]
Abstract
Background/Objectives: This study proposes a novel Hybrid Metaheuristic with explicit diversity control, aimed at finding an optimal feature subset by thoroughly exploring the search space to prevent premature convergence. Unlike traditional evolutionary computing techniques, which only consider the best individuals in a population, the proposed strategy also considers the worst individuals under certain conditions. Consequently, feature selection frequencies tend to be more uniform, decreasing the probability of premature convergence and local-optimum solutions. Methods: An image database containing 608 images, evenly balanced between positive and negative coronary stenosis cases, was used for experiments. A total of 473 features, including intensity, texture, and morphological types, were extracted from the image bank. A Support Vector Machine was employed to classify positive and negative stenosis cases, with Accuracy and the Jaccard Coefficient used as performance metrics. Results: The proposed strategy achieved a classification rate of 0.92 for Accuracy and 0.85 for the Jaccard Coefficient, obtaining a subset of 16 features, which represents a discrimination rate of 0.97 from the 473 initial features. Conclusions: The Hybrid Metaheuristic with explicit diversity control improved the classification performance of coronary stenosis cases compared to previous literature. Based on the achieved results, the identified feature subset demonstrates potential for use in clinical practice, particularly in decision-support information systems.
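As a rough check of the figures in this abstract, the sketch below recomputes the discrimination rate and shows how Accuracy and the Jaccard Coefficient are commonly derived from a binary confusion matrix. Two things here are assumptions and not taken from the paper: reading "discrimination rate" as the fraction of features discarded (which happens to reproduce the reported 0.97), and the confusion-matrix counts, which are invented purely for illustration.

```python
def discrimination_rate(n_selected: int, n_total: int) -> float:
    """Fraction of features discarded (assumed meaning of 'discrimination rate')."""
    return 1.0 - n_selected / n_total


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all cases that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def jaccard(tp: int, fp: int, fn: int) -> float:
    """Jaccard coefficient of the positive class: TP / (TP + FP + FN)."""
    return tp / (tp + fp + fn)


# 16 selected features out of 473, as stated in the abstract.
print(round(discrimination_rate(16, 473), 2))  # 0.97

# Hypothetical counts (NOT from the paper) for a 100-case test split.
tp, tn, fp, fn = 45, 47, 3, 5
print(round(accuracy(tp, tn, fp, fn), 2))  # 0.92
print(round(jaccard(tp, fp, fn), 2))       # 0.85
```

The Jaccard Coefficient ignores true negatives, so it is always at most as large as Accuracy whenever some negatives are classified correctly.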
Affiliation(s)
- Miguel-Angel Gil-Rios
  - Universidad Área Académica de Tecnologías de la Información, Universidad Tecnológica de León, Blvd. Universidad Tecnológica 225, Col. San Carlos, León 37670, Mexico
- Ivan Cruz-Aceves
  - CONAHCYT, Consejo Nacional de Humanidades, Ciencia y Tecnología (CONAHCYT), Centro de Investigación en Matemáticas (CIMAT), A.C., Jalisco S/N, Col. Valenciana, Guanajuato 36000, Mexico
- Arturo Hernandez-Aguirre
  - Centro de Investigación en Matemáticas (CIMAT), A.C., Jalisco S/N, Col. Valenciana, Guanajuato 36000, Mexico
- Martha-Alicia Hernandez-Gonzalez
  - Unidad Médica de Alta Especialidad (UMAE), Hospital de Especialidades No.1. Centro Médico Nacional del Bajio, Instituto Mexicano del Seguro Social (IMSS), Blvd. Adolfo López Mateos S/N, León 37150, Mexico
- Sergio-Eduardo Solorio-Meza
  - División de Ciencias e Ingenierías, Universidad de Guanajuato, Campus León, Loma del Bosque 103, Col. Lomas del Campestre, León 37150, Mexico
4
Khan A, Zubair S, Shuaib M, Sheneamer A, Alam S, Assiri B. Development of a robust parallel and multi-composite machine learning model for improved diagnosis of Alzheimer's disease: correlation with dementia-associated drug usage and AT(N) protein biomarkers. Front Neurosci 2024; 18:1391465. [PMID: 39308946 PMCID: PMC11412962 DOI: 10.3389/fnins.2024.1391465] [Received: 02/25/2024] [Accepted: 08/12/2024] [Indexed: 09/25/2024]
Abstract
Introduction: Machine learning (ML) algorithms and statistical modeling offer a potential solution to offset the challenge of diagnosing early Alzheimer's disease (AD) by leveraging multiple data sources and combining information on neuropsychological, genetic, and biomarker indicators. Among others, statistical models are a promising tool to enhance the clinical detection of early AD. In the present study, early AD was diagnosed by taking into account characteristics related to whether or not a patient was taking specific drugs and a significant protein as a predictor of Amyloid-Beta (Aβ), tau, and ptau [AT(N)] levels among participants. Methods: In this study, the optimization of predictive models for the diagnosis of AD pathologies was carried out using a set of baseline features. The model performance was improved by incorporating additional variables associated with patient drugs and protein biomarkers into the model. The diagnostic group consisted of five categories (cognitively normal, significant subjective memory concern, early mildly cognitively impaired, late mildly cognitively impaired, and AD), resulting in a multinomial classification challenge. In particular, we examined the relationship between AD diagnosis and the use of various drugs (calcium and vitamin D supplements, blood-thinning drugs, cholesterol-lowering drugs, and cognitive drugs). We propose a hybrid-clinical model that runs multiple ML models in parallel and then takes the majority vote, which enhances accuracy. We also assessed the significance of three cerebrospinal fluid biomarkers, Aβ, tau, and ptau, in the diagnosis of AD. We proposed that a hybrid-clinical model be used to simulate the MRI-based data, with five diagnostic groups of individuals, with further refinement that includes preclinical characteristics of the disorder. The proposed design builds a Meta-Model for four different sets of criteria: diagnosis from baseline features; from baseline and drug features; from baseline and protein features; and from baseline, drug, and protein features. Results: We were able to attain a maximum accuracy of 97.60% for baseline and protein data. We observed that the constructed model functioned effectively when all five drugs were included and when any single drug was used to diagnose the response variable. Interestingly, the constructed Meta-Model worked well when all three protein biomarkers were included, as well as when a single protein biomarker was utilized to diagnose the response variable. Discussion: It is noteworthy that we aimed to construct a pipeline design that incorporates comprehensive methodologies to detect Alzheimer's over wide-ranging input values and variables in the current study. Thus, the model that we developed could be used by clinicians and medical experts to advance Alzheimer's diagnosis and as a starting point for future research into AD and other neurodegenerative syndromes.
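The majority-vote step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `majority_vote`, the tie-breaking rule, and the short class labels (`CN`, `SMC`, `EMCI`, `LMCI`, `AD`) are hypothetical stand-ins for the five diagnostic categories.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def majority_vote(predictions: Sequence[Sequence[Hashable]]) -> List[Hashable]:
    """Combine per-model label predictions by plurality vote per sample.

    predictions[m][i] is the label that model m assigns to sample i.
    Ties go to the tied label that appears first in model order.
    """
    n_samples = len(predictions[0])
    combined = []
    for i in range(n_samples):
        votes = Counter(model_preds[i] for model_preds in predictions)
        combined.append(votes.most_common(1)[0][0])
    return combined


# Hypothetical outputs of three independently trained classifiers
# on three patients (labels are illustrative abbreviations).
model_a = ["CN", "AD", "EMCI"]
model_b = ["CN", "LMCI", "EMCI"]
model_c = ["SMC", "AD", "LMCI"]

print(majority_vote([model_a, model_b, model_c]))  # ['CN', 'AD', 'EMCI']
```

With an odd number of models and only two competing labels per sample a tie cannot occur, but in a five-class setting it can, so the tie-breaking rule should be chosen deliberately.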
Affiliation(s)
- Afreen Khan
  - Department of Computer Application, Faculty of Engineering & IT, Integral University, Lucknow, India
- Swaleha Zubair
  - Department of Computer Science, Faculty of Science, Aligarh Muslim University, Aligarh, India
- Mohammed Shuaib
  - Department of Computer Science, College of Engineering and Computer Science, Jazan University, Jazan, Saudi Arabia
- Abdullah Sheneamer
  - Department of Computer Science, College of Engineering and Computer Science, Jazan University, Jazan, Saudi Arabia
- Shadab Alam
  - Department of Computer Science, College of Engineering and Computer Science, Jazan University, Jazan, Saudi Arabia
- Basem Assiri
  - Department of Computer Science, College of Engineering and Computer Science, Jazan University, Jazan, Saudi Arabia
5
Chen W, Yang H, Yin L, Luo X. Large-scale IoT attack detection scheme based on LightGBM and feature selection using an improved salp swarm algorithm. Sci Rep 2024; 14:19165. [PMID: 39160210 PMCID: PMC11333491 DOI: 10.1038/s41598-024-69968-2] [Received: 06/05/2024] [Accepted: 08/12/2024] [Indexed: 08/21/2024]
Abstract
Due to the swift advancement of the Internet of Things (IoT), there has been a significant surge in the quantity of interconnected IoT devices that send and exchange vital data across the network. However, the frequency of attacks on the Internet of Things is steadily rising, posing a persistent risk to the security and privacy of IoT data. Therefore, it is crucial to develop a highly efficient method for detecting cyber threats on the Internet of Things. Yet several current network attack detection schemes suffer from insufficient detection accuracy, the curse of dimensionality due to excessively high data dimensions, and the sluggish efficiency of complex models. Employing metaheuristic algorithms for feature selection in network data represents an effective strategy among the myriad of solutions. This study introduces a more comprehensive metaheuristic algorithm called GQBWSSA, which is an enhanced version of the Salp Swarm Algorithm with several strategy improvements. Utilizing this algorithm, a threshold voting-based feature selection framework is designed to obtain an optimized set of features. This procedure efficiently decreases the number of dimensions in the data, thereby preventing the negative effects of high dimensionality and effectively extracting the most significant information. Subsequently, the extracted feature data is combined with the LightGBM algorithm to form a lightweight and efficient ensemble learning scheme for IoT attack detection. The proposed enhanced metaheuristic algorithm has superior performance in feature selection compared to recent metaheuristic algorithms, as evidenced by the experimental evaluation conducted using the NSLKDD and CICIoT2023 datasets. Compared to current popular ensemble learning solutions, the proposed overall solution exhibits excellent performance on multiple key indicators, including accuracy, precision, and training and detection time. In particular, on the large-scale dataset CICIoT2023, the proposed scheme achieves an accuracy of 99.70% in binary classification and 99.41% in multi-class classification.
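One plausible reading of "threshold voting-based feature selection" is sketched below: the metaheuristic is run several times, each run yields a binary feature mask, and a feature is kept when the share of runs that selected it reaches a threshold. The abstract does not give the exact voting rule, so the function `threshold_vote`, the `threshold` parameter, and the example masks are all assumptions made for illustration.

```python
from typing import List, Sequence


def threshold_vote(masks: Sequence[Sequence[int]], threshold: float = 0.5) -> List[int]:
    """Return indices of features selected by at least `threshold` of the runs.

    masks[r][j] is 1 if run r selected feature j, else 0.
    """
    n_runs = len(masks)
    n_features = len(masks[0])
    selected = []
    for j in range(n_features):
        share = sum(mask[j] for mask in masks) / n_runs
        if share >= threshold:
            selected.append(j)
    return selected


# Hypothetical masks from three independent optimizer runs over four features.
runs = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
]

print(threshold_vote(runs, threshold=0.6))  # [0, 2]
```

Raising the threshold trades feature-set size for stability: at `threshold=1.0` only features chosen in every run survive.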
Affiliation(s)
- Weizhe Chen
  - Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou, 510006, China
- Hongyu Yang
  - Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou, 510006, China
- Lihua Yin
  - Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou, 510006, China
- Xi Luo
  - Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou, 510006, China
6
Geng Y, Li Y, Deng C. An Improved Binary Walrus Optimizer with Golden Sine Disturbance and Population Regeneration Mechanism to Solve Feature Selection Problems. Biomimetics (Basel) 2024; 9:501. [PMID: 39194480 DOI: 10.3390/biomimetics9080501] [Received: 06/29/2024] [Revised: 08/13/2024] [Accepted: 08/14/2024] [Indexed: 08/29/2024]
Abstract
Feature selection (FS) is a significant dimensionality reduction technique in machine learning and data mining that is adept at managing high-dimensional data efficiently and enhancing model performance. Metaheuristic algorithms have become one of the most promising solutions in FS owing to their powerful search capabilities as well as their performance. In this paper, a novel improved binary walrus optimizer (WO) algorithm utilizing the golden sine strategy, elite opposition-based learning (EOBL), and a population regeneration mechanism (BGEPWO) is proposed for FS. First, the population is initialized using the iterative chaotic map with infinite collapses (ICMIC) to improve diversity. Second, a safe signal is obtained by introducing an adaptive operator to enhance the stability of the WO and optimize the trade-off between exploration and exploitation of the algorithm. Third, BGEPWO introduces a population regeneration mechanism that continuously eliminates hopeless individuals and generates new promising ones, which keeps the population moving toward the optimal solution and accelerates the convergence process. Fourth, EOBL is used to guide the escape behavior of the walrus to expand the search range. Finally, the golden sine strategy is utilized for perturbing the population in late iterations to improve the algorithm's capacity to evade local optima. The BGEPWO algorithm was evaluated on 21 datasets of different sizes and was compared with the BWO algorithm and 10 other representative optimization algorithms. The experimental results demonstrate that BGEPWO outperforms these competing algorithms in terms of fitness value, number of selected features, and F1-score on most datasets. The proposed algorithm achieves higher accuracy, better feature reduction ability, and stronger convergence by increasing population diversity, continuously balancing exploration and exploitation, and effectively escaping local optima.
Affiliation(s)
- Yanyu Geng
  - College of Computer Science and Technology, Jilin University, Changchun 130012, China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
- Ying Li
  - College of Computer Science and Technology, Jilin University, Changchun 130012, China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
- Chunyan Deng
  - College of Computer Science and Technology, Jilin University, Changchun 130012, China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
7
Qiu F, Heidari AA, Chen Y, Chen H, Liang G. Advancing forensic-based investigation incorporating slime mould search for gene selection of high-dimensional genetic data. Sci Rep 2024; 14:8599. [PMID: 38615048 PMCID: PMC11016116 DOI: 10.1038/s41598-024-59064-w] [Received: 01/02/2024] [Accepted: 04/06/2024] [Indexed: 04/15/2024]
Abstract
Modern medicine has produced large genetic datasets of high dimensions through advanced gene sequencing technology, and processing these data is of great significance for clinical decision-making. Gene selection (GS) is an important data preprocessing technique that aims to select a subset of feature information to improve performance and reduce data dimensionality. This study proposes an improved wrapper GS method based on forensic-based investigation (FBI). The method introduces the search mechanism of the slime mould algorithm into FBI to improve the original algorithm; the newly proposed algorithm is named SMA_FBI. GS is then performed by converting the continuous optimizer into a binary version through a transfer function. To verify the superiority of SMA_FBI, experiments are first executed on the 30-function test set of CEC2017, comparing it with 10 original algorithms and 10 state-of-the-art algorithms. The experimental results show that SMA_FBI is better than the other algorithms in terms of finding the optimal solution, convergence speed, and robustness. In addition, BSMA_FBI (the binary version of SMA_FBI) is compared with 8 binary algorithms on 18 high-dimensional genetic datasets from the UCI repository. The results indicate that BSMA_FBI is able to obtain high classification accuracy with fewer features selected in GS applications. Therefore, SMA_FBI is considered an optimization tool with great potential for dealing with global optimization problems, and its binary version, BSMA_FBI, can be used for GS tasks.
Affiliation(s)
- Feng Qiu
  - Institute of Big Data and Information Technology, Wenzhou University, Wenzhou, 325035, China
- Ali Asghar Heidari
  - School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, Tehran, Iran
- Yi Chen
  - Department of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou, 325035, China
- Huiling Chen
  - Institute of Big Data and Information Technology, Wenzhou University, Wenzhou, 325035, China
- Guoxi Liang
  - Department of Artificial Intelligence, Wenzhou Polytechnic, Wenzhou, 325035, China
8
Zhang Y, Liu B, Bunting KV, Brind D, Thorley A, Karwath A, Lu W, Zhou D, Wang X, Mobley AR, Tica O, Gkoutos GV, Kotecha D, Duan J. Development of automated neural network prediction for echocardiographic left ventricular ejection fraction. Front Med (Lausanne) 2024; 11:1354070. [PMID: 38686369 PMCID: PMC11057494 DOI: 10.3389/fmed.2024.1354070] [Received: 12/11/2023] [Accepted: 03/18/2024] [Indexed: 05/02/2024]
Abstract
Introduction: The echocardiographic measurement of left ventricular ejection fraction (LVEF) is fundamental to the diagnosis and classification of patients with heart failure (HF). Methods: This paper aimed to quantify LVEF automatically and accurately with the proposed pipeline method based on deep neural networks and ensemble learning. Within the pipeline, an Atrous Convolutional Neural Network (ACNN) was first trained to segment the left ventricle (LV), before employing the area-length formulation based on the ellipsoid single-plane model to calculate LVEF values. This formulation required inputs of LV area, derived from segmentation using an improved Jeffrey's method, as well as LV length, derived from a novel ensemble learning model. To further improve the pipeline's accuracy, an automated peak detection algorithm was used to identify end-diastolic and end-systolic frames, avoiding issues with human error. Subsequently, single-beat LVEF values were averaged across all cardiac cycles to obtain the final LVEF. Results: This method was developed and internally validated in an open-source dataset containing 10,030 echocardiograms. The Pearson's correlation coefficient was 0.83 for LVEF prediction compared to expert human analysis (p < 0.001), with a subsequent area under the receiver operating characteristic curve (AUROC) of 0.98 (95% confidence interval 0.97 to 0.99) for categorisation of HF with reduced ejection fraction (HFrEF; LVEF < 40%). In an external dataset with 200 echocardiograms, this method achieved an AUROC of 0.90 (95% confidence interval 0.88 to 0.91) for HFrEF assessment. Conclusion: The automated neural network-based calculation of LVEF is comparable to expert clinicians performing time-consuming, frame-by-frame manual evaluations of cardiac systolic function.
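The calculation chain in this abstract (area and length to volume, volume to single-beat LVEF, then averaging over beats) can be sketched as below. The sketch assumes the commonly cited single-plane area-length form V = 8A²/(3πL); the paper's exact formulation may differ, and the function names and the per-beat measurements are hypothetical values chosen only for illustration.

```python
import math
from typing import Sequence, Tuple


def area_length_volume(area_cm2: float, length_cm: float) -> float:
    """LV volume in mL from the single-plane area-length model: 8*A^2 / (3*pi*L)."""
    return 8.0 * area_cm2 ** 2 / (3.0 * math.pi * length_cm)


def ejection_fraction(edv_ml: float, esv_ml: float) -> float:
    """LVEF in percent from end-diastolic and end-systolic volumes."""
    return 100.0 * (edv_ml - esv_ml) / edv_ml


def mean_lvef(beats: Sequence[Tuple[float, float, float, float]]) -> float:
    """Average single-beat LVEF over cardiac cycles.

    Each beat is (ED area, ED length, ES area, ES length) in cm^2 and cm.
    """
    values = []
    for ed_area, ed_len, es_area, es_len in beats:
        edv = area_length_volume(ed_area, ed_len)
        esv = area_length_volume(es_area, es_len)
        values.append(ejection_fraction(edv, esv))
    return sum(values) / len(values)


# Hypothetical measurements for two beats (not taken from the paper).
beats = [(35.0, 8.5, 20.0, 7.0), (34.0, 8.4, 20.5, 7.1)]
lvef = mean_lvef(beats)
print(f"LVEF = {lvef:.1f}%  HFrEF (<40%): {lvef < 40.0}")
```

Because volume scales with the square of the segmented area, small segmentation errors are amplified in the volume estimate, which is one reason averaging across beats is helpful.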
Affiliation(s)
- Yuting Zhang
  - School of Computer Science, University of Birmingham, Edgbaston, Birmingham, United Kingdom
- Boyang Liu
  - Manchester University NHS Foundation Trust, Manchester, United Kingdom
- Karina V. Bunting
  - Institute of Cardiovascular Sciences, University of Birmingham, Edgbaston, Birmingham, United Kingdom
  - NIHR Birmingham Biomedical Research Centre and West Midlands NHS Secure Data Environment, University Hospitals Birmingham NHS Foundation Trust, Birmingham, United Kingdom
- David Brind
  - Institute of Cancer and Genomic Sciences, University of Birmingham, Edgbaston, Birmingham, United Kingdom
  - Health Data Research UK Midlands, University Hospitals Birmingham NHS Foundation Trust, Birmingham, United Kingdom
  - Centre for Health Data Science, University of Birmingham, Edgbaston, Birmingham, United Kingdom
- Alexander Thorley
  - School of Computer Science, University of Birmingham, Edgbaston, Birmingham, United Kingdom
- Andreas Karwath
  - Institute of Cancer and Genomic Sciences, University of Birmingham, Edgbaston, Birmingham, United Kingdom
  - Centre for Health Data Science, University of Birmingham, Edgbaston, Birmingham, United Kingdom
- Wenqi Lu
  - Department of Computing and Mathematics, Manchester Metropolitan University, Manchester, United Kingdom
- Diwei Zhou
  - Department of Mathematical Sciences, Loughborough University, Loughborough, United Kingdom
- Xiaoxia Wang
  - Institute of Cardiovascular Sciences, University of Birmingham, Edgbaston, Birmingham, United Kingdom
  - NIHR Birmingham Biomedical Research Centre and West Midlands NHS Secure Data Environment, University Hospitals Birmingham NHS Foundation Trust, Birmingham, United Kingdom
  - Health Data Research UK Midlands, University Hospitals Birmingham NHS Foundation Trust, Birmingham, United Kingdom
- Alastair R. Mobley
  - Institute of Cardiovascular Sciences, University of Birmingham, Edgbaston, Birmingham, United Kingdom
  - NIHR Birmingham Biomedical Research Centre and West Midlands NHS Secure Data Environment, University Hospitals Birmingham NHS Foundation Trust, Birmingham, United Kingdom
- Otilia Tica
  - Institute of Cardiovascular Sciences, University of Birmingham, Edgbaston, Birmingham, United Kingdom
- Georgios V. Gkoutos
  - Institute of Cancer and Genomic Sciences, University of Birmingham, Edgbaston, Birmingham, United Kingdom
  - Health Data Research UK Midlands, University Hospitals Birmingham NHS Foundation Trust, Birmingham, United Kingdom
  - Centre for Health Data Science, University of Birmingham, Edgbaston, Birmingham, United Kingdom
- Dipak Kotecha
  - Institute of Cardiovascular Sciences, University of Birmingham, Edgbaston, Birmingham, United Kingdom
  - NIHR Birmingham Biomedical Research Centre and West Midlands NHS Secure Data Environment, University Hospitals Birmingham NHS Foundation Trust, Birmingham, United Kingdom
  - Health Data Research UK Midlands, University Hospitals Birmingham NHS Foundation Trust, Birmingham, United Kingdom
- Jinming Duan
  - School of Computer Science, University of Birmingham, Edgbaston, Birmingham, United Kingdom
9
Zhang L, Chen X. Enhanced chimp hierarchy optimization algorithm with adaptive lens imaging for feature selection in data classification. Sci Rep 2024; 14:6910. [PMID: 38519568 PMCID: PMC10959962 DOI: 10.1038/s41598-024-57518-9] [Received: 12/30/2023] [Accepted: 03/19/2024] [Indexed: 03/25/2024]
Abstract
Feature selection is a critical component of machine learning and data mining that removes redundant and irrelevant features from a dataset. The Chimp Optimization Algorithm (CHoA) is widely applicable to various optimization problems due to its low number of parameters and fast convergence rate. However, CHoA has a weak exploration capability and tends to fall into local optima when solving feature selection problems, leading to ineffective removal of irrelevant and redundant features. To solve this problem, this paper proposes the Enhanced Chimp Hierarchy Optimization Algorithm with adaptive lens imaging (ALI-CHoASH) to search for the optimal feature subset in classification problems. Specifically, to enhance the exploration and exploitation capability of CHoA, we designed a chimp social hierarchy. We employed a novel social class factor to label the class situation of each chimp, enabling effective modelling and optimization of the relationships among chimp individuals. Then, to parse chimps' social and collaborative behaviours with different social classes, we introduce other attacking prey and autonomous search strategies to help chimp individuals approach the optimal solution faster. In addition, considering the poor diversity of chimp groups in late iterations, we propose an adaptive lens imaging back-learning strategy to keep the algorithm from falling into a local optimum. Finally, we validate the improvement of ALI-CHoASH in exploration and exploitation capabilities using several high-dimensional datasets. We also compare ALI-CHoASH with eight state-of-the-art methods in classification accuracy, feature subset size, and computation time to demonstrate its superiority.
Affiliation(s)
- Li Zhang
  - College of Computer Engineering, Jiangsu University of Technology, Changzhou, 213001, People's Republic of China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education Jilin University, Changchun, 130012, People's Republic of China
- XiaoBo Chen
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education Jilin University, Changchun, 130012, People's Republic of China
  - People's Bank of China Changzhou City Center Branch, Changzhou, 213001, Jiangsu, People's Republic of China
10
Li M, Luo Q, Zhou Y. BGOA-TVG: Binary Grasshopper Optimization Algorithm with Time-Varying Gaussian Transfer Functions for Feature Selection. Biomimetics (Basel) 2024; 9:187. [PMID: 38534872 DOI: 10.3390/biomimetics9030187] [Received: 02/01/2024] [Revised: 03/09/2024] [Accepted: 03/11/2024] [Indexed: 03/28/2024]
Abstract
Feature selection aims to select crucial features to improve classification accuracy in machine learning and data mining. In this paper, a new binary grasshopper optimization algorithm using time-varying Gaussian transfer functions (BGOA-TVG) is proposed for feature selection. Compared with the traditional S-shaped and V-shaped transfer functions, the proposed Gaussian time-varying transfer functions have the characteristics of a fast convergence speed and a strong global search capability to convert a continuous search space to a binary one. The BGOA-TVG is tested and compared to S-shaped and V-shaped binary grasshopper optimization algorithms and five state-of-the-art swarm intelligence algorithms for feature selection. The experimental results show that the BGOA-TVG has better performance in UCI, DEAP, and EPILEPSY datasets for feature selection.
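For readers unfamiliar with the baseline this abstract compares against, the sketch below shows the two classical families of transfer functions that map a continuous position to a binary feature mask. It uses the widely used sigmoid (S-shaped) and |tanh| (V-shaped) forms; it does not reproduce the paper's time-varying Gaussian functions, whose definition is not given in the abstract. The function names and update rules are illustrative assumptions.

```python
import math
import random
from typing import List, Sequence


def s_shaped(x: float) -> float:
    """Sigmoid transfer function: probability that the bit is set to 1."""
    return 1.0 / (1.0 + math.exp(-x))


def v_shaped(x: float) -> float:
    """|tanh| transfer function: probability that the current bit is flipped."""
    return abs(math.tanh(x))


def binarize_s(position: Sequence[float], rng: random.Random) -> List[int]:
    """S-shaped rule: each bit is 1 with probability s_shaped(x)."""
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]


def binarize_v(position: Sequence[float], bits: Sequence[int],
               rng: random.Random) -> List[int]:
    """V-shaped rule: flip the existing bit with probability v_shaped(x)."""
    return [1 - b if rng.random() < v_shaped(x) else b
            for x, b in zip(position, bits)]


rng = random.Random(0)
position = [2.0, -1.5, 0.1, -0.3]  # hypothetical continuous position
mask = binarize_s(position, rng)   # 1 means the feature is selected
mask = binarize_v(position, mask, rng)
print(mask)
```

The S-shaped rule ignores the current bit, while the V-shaped rule depends on it, so large moves in either direction trigger a flip and small moves leave the mask unchanged.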
Affiliation(s)
- Mengjun Li
  - College of Artificial Intelligence, Guangxi Minzu University, Nanning 530006, China
- Qifang Luo
  - College of Artificial Intelligence, Guangxi Minzu University, Nanning 530006, China
  - Guangxi Key Laboratories of Hybrid Computation and IC Design Analysis, Nanning 530006, China
- Yongquan Zhou
  - College of Artificial Intelligence, Guangxi Minzu University, Nanning 530006, China
  - Guangxi Key Laboratories of Hybrid Computation and IC Design Analysis, Nanning 530006, China
  - Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
11
Lee J, Yoon Y, Kim J, Kim YH. Metaheuristic-Based Feature Selection Methods for Diagnosing Sarcopenia with Machine Learning Algorithms. Biomimetics (Basel) 2024; 9:179. [PMID: 38534863 DOI: 10.3390/biomimetics9030179] [Received: 01/19/2024] [Revised: 03/01/2024] [Accepted: 03/13/2024] [Indexed: 03/28/2024]
Abstract
This study explores the efficacy of metaheuristic-based feature selection in improving machine learning performance for diagnosing sarcopenia. Extracting and using the features that most strongly affect diagnostic efficacy is a critical step when applying machine learning to sarcopenia diagnosis. Using data from the 8th Korean Longitudinal Study on Aging (KLoSA), this study examines harmony search (HS) and the genetic algorithm (GA) for feature selection. The resulting feature sets are evaluated with decision tree, random forest, support vector machine, and naïve Bayes classifiers. The HS-derived feature set trained with a support vector machine yielded an accuracy of 0.785 and a weighted F1 score of 0.782, which outperformed traditional methods. These findings underscore the competitive edge of metaheuristic-based selection, demonstrating its potential in advancing sarcopenia diagnosis. This study advocates for further exploration of metaheuristic-based feature selection's pivotal role in future sarcopenia research.
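Since the abstract reports a weighted F1 score alongside accuracy, the following sketch shows how that metric is usually defined: the per-class F1 scores averaged with weights equal to each class's share of the true labels. The function name `weighted_f1` and the toy labels are assumptions for illustration and do not come from the study's data.

```python
from collections import Counter
from typing import Sequence


def weighted_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Support-weighted mean of per-class F1 scores."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label in sorted(support):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += support[label] / total * f1
    return score


# Toy example (not study data): 1 = sarcopenia, 0 = no sarcopenia.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
print(round(weighted_f1(y_true, y_pred), 3))  # 0.75
```

On imbalanced data the weighted F1 leans toward the majority class, which is worth keeping in mind when it is reported next to accuracy.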
Collapse
Affiliation(s)
- Jaehyeong Lee
- Department of IT Convergence, Gachon University, 1342 Seongnamdaero, Sujeong-gu, Seongnam-si 13120, Gyeonggi-do, Republic of Korea
| | - Yourim Yoon
- Department of Computer Engineering, Gachon University, 1342 Seongnamdaero, Sujeong-gu, Seongnam-si 13120, Gyeonggi-do, Republic of Korea
| | - Jiyoun Kim
- Department of Exercise Rehabilitation, Gachon University, 191 Hambakmoe-ro, Yeonsu-gu, Incheon 21936, Republic of Korea
| | - Yong-Hyuk Kim
- School of Software, Kwangwoon University, 20 Kwangwoon-ro, Nowon-gu, Seoul 01897, Republic of Korea
| |
|
12
|
Hafiz R, Saeed S. Hybrid whale algorithm with evolutionary strategies and filtering for high-dimensional optimization: Application to microarray cancer data. PLoS One 2024; 19:e0295643. [PMID: 38466740 PMCID: PMC10927076 DOI: 10.1371/journal.pone.0295643] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2023] [Accepted: 11/28/2023] [Indexed: 03/13/2024] Open
Abstract
The standard whale optimization algorithm (WOA) is prone to suboptimal results and inefficiencies in high-dimensional search spaces. Therefore, examining the components of the WOA is critical. Computer-generated initial populations often exhibit an uneven distribution in the solution space, leading to low diversity. We propose a fusion of this algorithm with a discrete recombinant evolutionary strategy to enhance initialization diversity. We conduct simulation experiments and compare the proposed algorithm with the original WOA on thirteen benchmark test functions. Simulation experiments on unimodal and multimodal benchmarks verified the better performance of the proposed RESHWOA in terms of accuracy, minimum mean, and low standard deviation. Furthermore, we applied two data reduction techniques, Bhattacharyya distance and signal-to-noise ratio. The support vector machine (SVM) handles high-dimensional datasets and numerical features well. Although it already works well with its default settings, optimizing its parameters can significantly improve its performance. We applied the RESHWOA and WOA methods to six microarray cancer datasets to optimize the SVM parameters. The exhaustive examination and detailed results demonstrate that the new structure addresses the main shortcomings of the WOA. We conclude that the proposed RESHWOA performed significantly better than the WOA.
Affiliation(s)
- Rahila Hafiz
- College of Statistical Sciences, University of the Punjab, Lahore, Pakistan
| | - Sana Saeed
- College of Statistical Sciences, University of the Punjab, Lahore, Pakistan
| |
|
13
|
Nicolle A, Deng S, Ihme M, Kuzhagaliyeva N, Ibrahim EA, Farooq A. Mixtures Recomposition by Neural Nets: A Multidisciplinary Overview. J Chem Inf Model 2024; 64:597-620. [PMID: 38284618 DOI: 10.1021/acs.jcim.3c01633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/30/2024]
Abstract
Artificial Neural Networks (ANNs) are transforming how we understand chemical mixtures, providing an expressive view of the chemical space and multiscale processes. Their hybridization with physical knowledge can bridge the gap between predictivity and understanding of the underlying processes. This overview explores recent progress in ANNs, particularly their potential in the 'recomposition' of chemical mixtures. Graph-based representations reveal patterns among mixture components, and deep learning models excel in capturing complexity and symmetries when compared to traditional Quantitative Structure-Property Relationship models. Key components, such as Hamiltonian networks and convolution operations, play a central role in representing multiscale mixtures. The integration of ANNs with Chemical Reaction Networks and Physics-Informed Neural Networks for inverse chemical kinetic problems is also examined. The combination of sensors with ANNs shows promise in optical and biomimetic applications. A common ground is identified in the context of statistical physics, where ANN-based methods iteratively adapt their models by blending their initial states with training data. The concept of mixture recomposition unveils a reciprocal inspiration between ANNs and reactive mixtures, highlighting learning behaviors influenced by the training environment.
Affiliation(s)
- Andre Nicolle
- Aramco Fuel Research Center, Rueil-Malmaison 92852, France
| | - Sili Deng
- Massachusetts Institute of Technology, Cambridge 02139, Massachusetts, United States
| | - Matthias Ihme
- Stanford University, Stanford 94305, California, United States
| | | | - Emad Al Ibrahim
- King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia
| | - Aamir Farooq
- King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia
| |
|
14
|
Yang G, Li W, Xie W, Wang L, Yu K. An improved binary particle swarm optimization algorithm for clinical cancer biomarker identification in microarray data. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 244:107987. [PMID: 38157825 DOI: 10.1016/j.cmpb.2023.107987] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Revised: 11/04/2023] [Accepted: 12/16/2023] [Indexed: 01/03/2024]
Abstract
BACKGROUND AND OBJECTIVE The limited number of samples and high-dimensional features in microarray data make selecting a small number of features for disease diagnosis a challenging problem. Traditional feature selection methods based on evolutionary algorithms struggle to find the optimal feature set within a limited time when dealing with high-dimensional feature selection problems. New solutions are proposed to solve these problems. METHODS In this paper, we propose a hybrid feature selection method (C-IFBPFE) for biomarker identification in microarray data, which combines clustering and improved binary particle swarm optimization while incorporating an embedded feature elimination strategy. First, an adaptive redundant feature judgment method based on correlation clustering is proposed for feature screening to reduce the search space in the subsequent stage. Second, we propose an improved flipping probability-based binary particle swarm optimization (IFBPSO), which is better suited to the binary particle swarm optimization problem. Finally, we design a new feature elimination (FE) strategy embedded in the binary particle swarm optimization algorithm. This strategy gradually removes poorer features during iterations to reduce the number of features and improve accuracy. RESULTS We compared C-IFBPFE with other published hybrid feature selection methods on eight public datasets and analyzed the impact of each improvement. The proposed method outperforms other current state-of-the-art feature selection methods in terms of accuracy, number of features, sensitivity, and specificity. The ablation study validates the efficacy of each component; in particular, the proposed feature elimination strategy significantly improves the performance of the algorithm. CONCLUSIONS The hybrid feature selection method proposed in this paper helps address the issue of high-dimensional microarray data with few samples. It can select a small subset of features and achieve high classification accuracy on microarray datasets. Additionally, independent validation of the selected features shows that those chosen by C-IFBPFE have strong correlations with disease phenotypes and can identify important biomarkers from data related to biomedical problems.
Affiliation(s)
- Guicheng Yang
- College of Computer Science and Engineering, Northeastern University, Shenyang, 110000, Liaoning, China.
| | - Wei Li
- Key Laboratory of Intelligent Computing in Medical Image (MIIC), Northeastern University, Ministry of Education, Shenyang, 110000, Liaoning, China; National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Shenyang, 110819, Liaoning, China.
| | - Weidong Xie
- College of Computer Science and Engineering, Northeastern University, Shenyang, 110000, Liaoning, China.
| | - Linjie Wang
- College of Computer Science and Engineering, Northeastern University, Shenyang, 110000, Liaoning, China.
| | - Kun Yu
- College of Medicine and Bioinformation Engineering, Northeastern University, Shenyang, 110819, Liaoning, China.
| |
|
15
|
Feda AK, Adegboye M, Adegboye OR, Agyekum EB, Fendzi Mbasso W, Kamel S. S-shaped grey wolf optimizer-based FOX algorithm for feature selection. Heliyon 2024; 10:e24192. [PMID: 38293420 PMCID: PMC10825485 DOI: 10.1016/j.heliyon.2024.e24192] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/06/2023] [Revised: 12/09/2023] [Accepted: 01/04/2024] [Indexed: 02/01/2024] Open
Abstract
The FOX algorithm is a recently developed metaheuristic approach inspired by the behavior of foxes in their natural habitat. While the FOX algorithm exhibits commendable performance, its basic version may become trapped in local optima in complex problem scenarios, failing to identify the optimal solution because of its weak exploitation capabilities. This research addresses a high-dimensional feature selection problem. In feature selection, the most informative features are retained while irrelevant ones are discarded. An enhanced version of the FOX algorithm is proposed to mitigate its drawbacks in feature selection. The improved approach, referred to as the S-shaped Grey Wolf Optimizer-based FOX (FOX-GWO), focuses on augmenting the local search capabilities of the FOX algorithm via the integration of GWO. Additionally, the introduction of an S-shaped transfer function enables the population to explore both binary options throughout the search process. In a series of experiments on 18 datasets with varying dimensions, FOX-GWO performs best on 83.33% of the datasets for average accuracy, 61.11% for reduced feature dimensionality, and 72.22% for average fitness value, indicating that it explores high-dimensional spaces efficiently. These findings highlight its practical value and potential to advance feature selection in complex data analysis, enhancing model prediction accuracy.
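As a rough illustration of how an S-shaped transfer function can map a continuous search position onto a binary feature mask, the sketch below uses the standard sigmoid and a seeded random threshold. The function names, the sample position vector, and the choice of the plain sigmoid are assumptions made for illustration; the abstract does not state which S-shaped variant FOX-GWO uses.

```python
import math
import random
from typing import List, Sequence


def s_shaped(x: float) -> float:
    """Standard sigmoid, one common S-shaped transfer function (assumed here)."""
    return 1.0 / (1.0 + math.exp(-x))


def binarize(position: Sequence[float], rng: random.Random) -> List[int]:
    """Turn a continuous position into a 0/1 feature mask.

    Each dimension is selected (1) with probability s_shaped(x),
    so both binary options remain reachable for moderate values of x.
    """
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so the run is repeatable
    position = [-6.0, -1.0, 0.0, 1.0, 6.0]  # hypothetical agent position
    print(binarize(position, rng))
```

Strongly negative coordinates are almost never selected and strongly positive ones almost always are, while coordinates near zero are selected about half the time.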
Affiliation(s)
- Afi Kekeli Feda
- Management Information System Department, European University of Lefke, Mersin, 10, Turkey
| | | | | | - Ephraim Bonah Agyekum
- Department of Nuclear and Renewable Energy, Ural Federal University named after the first President of Russia Boris Yeltsin, 620002, 19 Mira Street, Ekaterinburg, Russia
| | - Wulfran Fendzi Mbasso
- Laboratory of Technology and Applied Sciences, University Institute of Technology, University of Douala, PO Box: 8698, Douala, Cameroon
| | - Salah Kamel
- Department of Electrical Engineering, Faculty of Engineering, Aswan University, Aswan, 81542, Egypt
| |
|
16
|
Abdelrazek M, Abd Elaziz M, El-Baz AH. CDMO: Chaotic Dwarf Mongoose Optimization Algorithm for feature selection. Sci Rep 2024; 14:701. [PMID: 38184680 PMCID: PMC10771514 DOI: 10.1038/s41598-023-50959-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2023] [Accepted: 12/28/2023] [Indexed: 01/08/2024] Open
Abstract
In this paper, a modified version of the Dwarf Mongoose Optimization Algorithm (DMO) for feature selection is proposed. DMO is a novel swarm intelligence algorithm that mimics the foraging behavior of the dwarf mongoose. The developed method, named Chaotic DMO (CDMO), is a wrapper-based model that selects the features that give higher classification accuracy. To speed up convergence and increase the effectiveness of DMO, ten chaotic maps were used to modify the key elements of dwarf mongoose movement during the optimization process. To evaluate the efficiency of CDMO, ten different UCI datasets are used, and CDMO is compared against the original DMO and other well-known metaheuristic techniques, namely Ant Colony optimization (ACO), Whale optimization algorithm (WOA), Artificial rabbit optimization (ARO), Harris hawk optimization (HHO), Equilibrium optimizer (EO), Ring theory based harmony search (RTHS), Random switching serial gray-whale optimizer (RSGW), Salp swarm algorithm based on particle swarm optimization (SSAPSO), Binary genetic algorithm (BGA), Adaptive switching gray-whale optimizer (ASGW) and Particle Swarm optimization (PSO). The experimental results show that CDMO gives higher performance than the other methods used in feature selection. High values of accuracy (91.9-100%), sensitivity (77.6-100%), precision (91.8-96.08%), specificity (91.6-100%) and F-Score (90-100%) are obtained for all ten UCI datasets. In addition, the proposed method is further assessed against the CEC'2022 benchmark functions.
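To make the idea of a chaotic map concrete, the sketch below generates a sequence from the logistic map, one widely used chaotic map. Whether the logistic map is among the ten maps used in CDMO is not stated in the abstract, so its use here, along with the function name and starting value, is an assumption for illustration only.

```python
from typing import List


def logistic_map(x0: float, n: int, r: float = 4.0) -> List[float]:
    """Return n successive values of x_{k+1} = r * x_k * (1 - x_k).

    With r = 4 the sequence stays in [0, 1], which is why such maps are
    often used as a drop-in replacement for uniform random draws.
    """
    if not 0.0 < x0 < 1.0:
        raise ValueError("x0 must lie strictly between 0 and 1")
    values = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


if __name__ == "__main__":
    # Hypothetical starting point; each value could replace one random
    # coefficient in a metaheuristic update step.
    print([round(v, 4) for v in logistic_map(x0=0.7, n=5)])
```

Some starting points are degenerate: for example, 0.25 and 0.75 lead straight to the fixed point 0.75, so in practice the seed value is chosen to avoid them.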
Affiliation(s)
- Mohammed Abdelrazek
- Department of Mathematics, Faculty of Science, Damietta University, New Damietta, 34517, Egypt
| | - Mohamed Abd Elaziz
- Department of Mathematics, Faculty of Science, Zagazig University, Zagazig, 44519, Egypt
- Artificial Intelligence Research Center (AIRC), Ajman University, Ajman 346, UAE
- MEU Research Unit, Middle East University, Amman, 11831, Jordan
- Department of Electrical and Computer Engineering, Lebanese American University, Byblos, 13-5053, Lebanon
| | - A H El-Baz
- Department of Computer Science, Faculty of Computers and Artificial Intelligence, Damietta University, New Damietta, 34517, Egypt.
| |
|
17
|
Liu G, Guo Z, Liu W, Jiang F, Fu E. A feature selection method based on the Golden Jackal-Grey Wolf Hybrid Optimization Algorithm. PLoS One 2024; 19:e0295579. [PMID: 38165924 PMCID: PMC10760777 DOI: 10.1371/journal.pone.0295579] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2023] [Accepted: 11/20/2023] [Indexed: 01/04/2024] Open
Abstract
This paper proposes a feature selection method based on a hybrid optimization algorithm that combines the Golden Jackal Optimization (GJO) and Grey Wolf Optimizer (GWO). The primary objective of this method is to create an effective data dimensionality reduction technique for eliminating redundant, irrelevant, and noisy features within high-dimensional datasets. Drawing inspiration from the Chinese idiom "Chai Lang Hu Bao," hybrid algorithm mechanisms, and cooperative behaviors observed in natural animal populations, we amalgamate the GWO algorithm, the Lagrange interpolation method, and the GJO algorithm to propose the multi-strategy fusion GJO-GWO algorithm. In Case 1, the GJO-GWO algorithm addressed eight complex benchmark functions. In Case 2, GJO-GWO was utilized to tackle ten feature selection problems. Experimental results consistently demonstrate that under identical experimental conditions, whether solving complex benchmark functions or addressing feature selection problems, GJO-GWO exhibits smaller means, lower standard deviations, higher classification accuracy, and reduced execution times. These findings affirm the superior optimization performance, classification accuracy, and stability of the GJO-GWO algorithm.
Affiliation(s)
- Guangwei Liu
- College of Mining, Liaoning Technical University, Fuxin, Liaoning, China
| | - Zhiqing Guo
- College of Mining, Liaoning Technical University, Fuxin, Liaoning, China
| | - Wei Liu
- College of Science, Liaoning Technical University, Fuxin, Liaoning, China
| | - Feng Jiang
- College of Science, Liaoning Technical University, Fuxin, Liaoning, China
| | - Ensan Fu
- College of Mining, Liaoning Technical University, Fuxin, Liaoning, China
| |
|
18
|
Barrera-García J, Cisternas-Caneo F, Crawford B, Gómez Sánchez M, Soto R. Feature Selection Problem and Metaheuristics: A Systematic Literature Review about Its Formulation, Evaluation and Applications. Biomimetics (Basel) 2023; 9:9. [PMID: 38248583 PMCID: PMC10813816 DOI: 10.3390/biomimetics9010009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2023] [Revised: 12/16/2023] [Accepted: 12/18/2023] [Indexed: 01/23/2024] Open
Abstract
Feature selection is becoming a relevant problem within the field of machine learning. The feature selection problem focuses on the selection of the small, necessary, and sufficient subset of features that represent the general set of features, eliminating redundant and irrelevant information. Given the importance of the topic, in recent years there has been a boom in the study of the problem, generating a large number of related investigations. Given this, this work analyzes 161 articles published between 2019 and 2023 (20 April 2023), emphasizing the formulation of the problem and performance measures, and proposing classifications for the objective functions and evaluation metrics. Furthermore, an in-depth description and analysis of metaheuristics, benchmark datasets, and practical real-world applications are presented. Finally, in light of recent advances, this review paper provides future research opportunities.
Affiliation(s)
- José Barrera-García
- Escuela de Ingeniería Informática, Pontificia Universidad Católica de Valparaíso, Avenida Brasil 2241, Valparaíso 2362807, Chile; (J.B.-G.); (F.C.-C.); (R.S.)
| | - Felipe Cisternas-Caneo
- Escuela de Ingeniería Informática, Pontificia Universidad Católica de Valparaíso, Avenida Brasil 2241, Valparaíso 2362807, Chile; (J.B.-G.); (F.C.-C.); (R.S.)
| | - Broderick Crawford
- Escuela de Ingeniería Informática, Pontificia Universidad Católica de Valparaíso, Avenida Brasil 2241, Valparaíso 2362807, Chile; (J.B.-G.); (F.C.-C.); (R.S.)
| | - Mariam Gómez Sánchez
- Departamento de Electrotecnia e Informática, Universidad Técnica Federico Santa María, Federico Santa María 6090, Viña del Mar 2520000, Chile;
| | - Ricardo Soto
- Escuela de Ingeniería Informática, Pontificia Universidad Católica de Valparaíso, Avenida Brasil 2241, Valparaíso 2362807, Chile; (J.B.-G.); (F.C.-C.); (R.S.)
| |
|
19
|
Sharma R, Mahanti GK, Panda G, Rath A, Dash S, Mallik S, Zhao Z. Comparative performance analysis of binary variants of FOX optimization algorithm with half-quadratic ensemble ranking method for thyroid cancer detection. Sci Rep 2023; 13:19598. [PMID: 37950041 PMCID: PMC10638362 DOI: 10.1038/s41598-023-46865-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Accepted: 11/06/2023] [Indexed: 11/12/2023] Open
Abstract
Thyroid cancer is a life-threatening condition that arises from the cells of the thyroid gland, located in the front of the neck just below the Adam's apple. While it is not as prevalent as other types of cancer, it ranks prominently among the commonly observed cancers affecting the endocrine system. Machine learning has emerged as a valuable medical diagnostics tool, specifically for detecting thyroid abnormalities. Feature selection is of vital importance in machine learning because it decreases data dimensionality and concentrates on the most pertinent features. This process improves model performance, reduces training time, and enhances interpretability. This study examined binary variants of FOX-optimization algorithms for feature selection. The study employed eight transfer functions (S and V shape) to convert the FOX-optimization algorithms into their binary versions. Vision transformer-based pre-trained models (DeiT and Swin Transformer) are used for feature extraction. The extracted features are transformed using locally linear embedding, and binary FOX-optimization algorithms are applied for feature selection in conjunction with the Naïve Bayes classifier. The study utilized two thyroid cancer image datasets (ultrasound and histopathological). Benchmarking is performed using the half-quadratic theory-based ensemble ranking technique. Two TOPSIS-based methods (H-TOPSIS and A-TOPSIS) are employed for initial model ranking, followed by an ensemble technique for final ranking. The problem is treated as a multi-objective optimization task with accuracy, F2-score, AUC-ROC, and feature space size as optimization goals. The binary FOX-optimization algorithm based on the [Formula: see text] transfer function achieved superior performance compared to the other variants on both datasets and with both feature extraction techniques. The proposed framework, comprising a Swin transformer to extract features, a FOX optimization algorithm with a V1 transfer function for feature selection, and a Naïve Bayes classifier, obtained the best performance for both datasets. The best model achieved an accuracy of 94.75%, an AUC-ROC value of 0.9848, an F2-Score of 0.9365, and an inference time of 0.0353 seconds, and selected 5 features for the ultrasound dataset. For the histopathological dataset, the diagnosis model achieved an overall accuracy of 89.71%, an AUC-ROC score of 0.9329, an F2-Score of 0.8760, and an inference time of 0.05141 seconds, and selected 12 features. The proposed model achieved results comparable to existing research with a small feature space.
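The F2-score reported above is the F-beta score with beta = 2, which weights recall more heavily than precision. The short sketch below shows the standard formula; the function name and the sample precision and recall values are hypothetical and are not taken from the study.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """F-beta score: (1 + beta^2) * P * R / (beta^2 * P + R).

    beta = 2 gives the F2-score; beta = 1 gives the familiar F1-score.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # convention when nothing is predicted correctly
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Hypothetical values: the same pair of numbers scores higher under F2
    # when the larger one is the recall.
    print(f_beta(precision=0.5, recall=1.0))
    print(f_beta(precision=1.0, recall=0.5))
```

Because missed cases are usually costlier than false alarms in screening settings, a recall-weighted score of this kind is a common choice for diagnostic models.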
Affiliation(s)
- Rohit Sharma
- Department of Electronics and Communication Engineering, NIT, Durgapur, 713209, India
| | - Gautam Kumar Mahanti
- Department of Electronics and Communication Engineering, NIT, Durgapur, 713209, India
| | - Ganapati Panda
- Department of Electronics and Communication Engineering, C.V. Raman Global University, Bhubaneswar, 752054, India
| | - Adyasha Rath
- Department of Computer Science and Engineering, C.V. Raman Global University, Bhubaneswar, 752054, India
| | - Sujata Dash
- Department of Information Technology, Nagaland University, Dimapur, India
| | - Saurav Mallik
- Department of Environmental Health, Harvard T. H. Chan School of Public Health, Boston, MA, USA.
| | - Zhongming Zhao
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA.
| |
|
20
|
Moslemi A, Bidar M, Ahmadian A. Subspace learning using structure learning and non-convex regularization: Hybrid technique with mushroom reproduction optimization in gene selection. Comput Biol Med 2023; 164:107309. [PMID: 37536092 DOI: 10.1016/j.compbiomed.2023.107309] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/24/2023] [Revised: 07/26/2023] [Accepted: 07/28/2023] [Indexed: 08/05/2023]
Abstract
Gene selection, as a high-dimensional problem, has drawn considerable attention in machine learning and computational biology over the past decade. For gene selection in cancer datasets, feature selection techniques of different types have been developed in terms of strategy (filter, wrapper, and embedded) and label information (supervised, unsupervised, and semi-supervised). However, hybrid feature selection can still improve performance. In this paper, we propose a hybrid feature selection method based on filter and wrapper strategies. In the filter phase, we develop an unsupervised feature selection method based on non-convex regularized non-negative matrix factorization and structure learning, which we term NCNMFSL. In the wrapper phase, mushroom reproduction optimization (MRO) is leveraged for the first time to obtain the most informative feature subset. In this hybrid feature selection method, irrelevant features are filtered out through NCNMFSL, and the most discriminative features are selected by MRO. To show the effectiveness of the proposed method, numerical experiments are conducted on the Breast, Heart, Colon, Leukemia, Prostate, Tox-171, and GLI-85 benchmark datasets. SVM and decision tree classifiers are used to analyze the proposed technique, and the top accuracies are 0.97, 0.84, 0.98, 0.95, 0.98, 0.87, and 0.85 for Breast, Heart, Colon, Leukemia, Prostate, Tox-171, and GLI-85, respectively. The computational results show the effectiveness of the proposed method in comparison with state-of-the-art feature selection techniques.
Affiliation(s)
- Amir Moslemi
- Department of Physics, Ryerson University, Toronto, ON, Canada.
| | - Mahdi Bidar
- Department of Computer Science, University of Regina, Regina, Canada
| | - Arash Ahmadian
- Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto, Toronto, Ontario, Canada
| |
|
21
|
Dokeroglu T. A new parallel multi-objective Harris hawk algorithm for predicting the mortality of COVID-19 patients. PeerJ Comput Sci 2023; 9:e1430. [PMID: 37346714 PMCID: PMC10280461 DOI: 10.7717/peerj-cs.1430] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Accepted: 05/18/2023] [Indexed: 06/23/2023]
Abstract
Harris' Hawk Optimization (HHO) is a novel metaheuristic inspired by the collective hunting behaviors of hawks. This technique employs the flight patterns of hawks to produce (near-)optimal solutions, enhanced with feature selection, for challenging classification problems. In this study, we propose a new parallel multi-objective HHO algorithm for predicting the mortality risk of COVID-19 patients based on their symptoms. There are two objectives in this optimization problem: to reduce the number of features while increasing the accuracy of the predictions. We conduct comprehensive experiments on a recent real-world COVID-19 dataset from Kaggle. An augmented version of the COVID-19 dataset is also generated and experimentally shown to improve the quality of the solutions. Significant improvements are observed compared to existing state-of-the-art metaheuristic wrapper algorithms. We report better classification results with feature selection than when using the entire set of features. In the experiments, a prediction accuracy of 98.15% is achieved with a 45% reduction in the number of features. We successfully obtained new best solutions for this COVID-19 dataset.
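With two competing objectives (fewer features, higher accuracy), candidate solutions are often compared by Pareto dominance. The sketch below illustrates that comparison under stated assumptions: the abstract does not say how the proposed algorithm ranks solutions, and the names and candidate values here are hypothetical.

```python
from typing import List, Tuple

# (number of selected features, prediction accuracy)
Solution = Tuple[int, float]


def dominates(a: Solution, b: Solution) -> bool:
    """True if a is no worse than b on both objectives and better on one.

    Fewer features is better; higher accuracy is better.
    """
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(solutions: List[Solution]) -> List[Solution]:
    """Keep only the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(t, s) for t in solutions)]


if __name__ == "__main__":
    # Hypothetical candidates, not results from the paper.
    candidates = [(10, 0.90), (5, 0.90), (5, 0.85), (3, 0.80), (12, 0.95)]
    print(pareto_front(candidates))
```

The surviving solutions represent different trade-offs between model size and accuracy, and picking one of them is left to the practitioner.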
Affiliation(s)
- Tansel Dokeroglu
- Cankaya University, Software Engineering Department, Ankara, Turkey
| |
|
22
|
Liu D, Zhang X, Zhang Z, Jiang H. A Hybrid Feature Selection and Multi-Label Driven Intelligent Fault Diagnosis Method for Gearbox. SENSORS (BASEL, SWITZERLAND) 2023; 23:4792. [PMID: 37430707 DOI: 10.3390/s23104792] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Revised: 05/11/2023] [Accepted: 05/13/2023] [Indexed: 07/12/2023]
Abstract
Gearboxes are used in practically all complex machinery because of their high transmission accuracy and load capacity, so their failure frequently results in significant financial losses. Although numerous data-driven intelligent diagnosis approaches have been proposed and successfully applied to compound fault diagnosis in recent years, the classification of high-dimensional data remains difficult. With the best diagnostic performance as the ultimate objective, a feature selection and fault decoupling framework is proposed in this paper. The framework uses multi-label K-nearest neighbors (ML-kNN) as the classifier and can automatically determine the optimal subset from the original high-dimensional feature set. The proposed feature selection method is a hybrid framework that can be divided into three stages. In the first stage, three filter models (the Fisher score, information gain, and Pearson's correlation coefficient) are used to pre-rank candidate features. In the second stage, a weighting scheme based on the weighted average method is proposed to fuse the pre-ranking results obtained in the first stage, with the weights optimized by a genetic algorithm, to re-rank the features. In the third stage, the optimal subset is found automatically and iteratively using three heuristic strategies: binary search, sequential forward search, and sequential backward search. The method takes feature irrelevance, redundancy, and inter-feature interaction into account in the selection process, and the selected optimal subsets have better diagnostic performance. On two gearbox compound fault datasets, ML-kNN performs exceptionally well using the optimal subset, with subset accuracies of 96.22% and 100%. The experimental findings demonstrate the effectiveness of the proposed method in predicting various labels for compound fault samples to identify and decouple compound faults. The proposed method performs better in terms of classification accuracy and optimal subset dimensionality when compared to other existing methods.
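The second stage described above, fusing several filter rankings through a weighted average, can be sketched as follows. This is a minimal illustration under stated assumptions: the weights are fixed by hand rather than optimized by a genetic algorithm, the fusion operates on rank positions, every filter is assumed to rank the same features, and all names and values are hypothetical.

```python
from typing import Dict, List


def fuse_rankings(rankings: Dict[str, List[str]],
                  weights: Dict[str, float]) -> List[str]:
    """Re-rank features by the weighted average of their rank positions.

    rankings maps a filter name to features ordered best-first.
    weights maps the same filter names to non-negative weights.
    """
    total = sum(weights.values())
    fused: Dict[str, float] = {}
    for name, order in rankings.items():
        for position, feature in enumerate(order, start=1):
            share = weights[name] * position / total
            fused[feature] = fused.get(feature, 0.0) + share
    # A lower weighted rank is better; ties are broken by feature name.
    return sorted(fused, key=lambda f: (fused[f], f))


if __name__ == "__main__":
    # Hypothetical pre-rankings from three filter models.
    rankings = {
        "fisher": ["f1", "f2", "f3"],
        "info_gain": ["f2", "f1", "f3"],
        "pearson": ["f2", "f3", "f1"],
    }
    weights = {"fisher": 0.5, "info_gain": 0.3, "pearson": 0.2}
    print(fuse_rankings(rankings, weights))
```

In the hypothetical example, f2 ends up first because two of the three filters place it at the top, even though the most heavily weighted filter prefers f1.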
Affiliation(s)
- Di Liu
- College of Intelligent Manufacturing and Industrial Modernization, Xinjiang University, Urumchi 830017, China
| | - Xiangfeng Zhang
- College of Intelligent Manufacturing and Industrial Modernization, Xinjiang University, Urumchi 830017, China
| | - Zhiyu Zhang
- College of Intelligent Manufacturing and Industrial Modernization, Xinjiang University, Urumchi 830017, China
| | - Hong Jiang
- College of Intelligent Manufacturing and Industrial Modernization, Xinjiang University, Urumchi 830017, China
| |
|
23
|
Ali MU, Hussain SJ, Zafar A, Bhutta MR, Lee SW. WBM-DLNets: Wrapper-Based Metaheuristic Deep Learning Networks Feature Optimization for Enhancing Brain Tumor Detection. Bioengineering (Basel) 2023; 10:475. [PMID: 37106662 PMCID: PMC10135892 DOI: 10.3390/bioengineering10040475] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/21/2023] [Revised: 04/07/2023] [Accepted: 04/11/2023] [Indexed: 04/29/2023] Open
Abstract
This study presents wrapper-based metaheuristic deep learning networks (WBM-DLNets) feature optimization algorithms for brain tumor diagnosis using magnetic resonance imaging. Herein, 16 pretrained deep learning networks are used to compute the features. Eight metaheuristic optimization algorithms, namely, the marine predator algorithm, atom search optimization algorithm (ASOA), Harris hawks optimization algorithm, butterfly optimization algorithm, whale optimization algorithm, grey wolf optimization algorithm (GWOA), bat algorithm, and firefly algorithm, are used to evaluate the classification performance using a support vector machine (SVM)-based cost function. A deep-learning network selection approach is applied to determine the best deep-learning network. Finally, all deep features of the best deep learning networks are concatenated to train the SVM model. The proposed WBM-DLNets approach is validated based on an available online dataset. The results reveal that the classification accuracy is significantly improved by utilizing the features selected using WBM-DLNets relative to those obtained using the full set of deep features. DenseNet-201-GWOA and EfficientNet-b0-ASOA yield the best results, with a classification accuracy of 95.7%. Additionally, the results of the WBM-DLNets approach are compared with those reported in the literature.
Affiliation(s)
- Muhammad Umair Ali
- Department of Intelligent Mechatronics Engineering, Sejong University, Seoul 05006, Republic of Korea
- Shaik Javeed Hussain
- Department of Electrical and Electronics, Global College of Engineering and Technology, Muscat 112, Oman
- Amad Zafar
- Department of Intelligent Mechatronics Engineering, Sejong University, Seoul 05006, Republic of Korea
- Muhammad Raheel Bhutta
- Department of Electrical and Computer Engineering, University of Utah Asia Campus, Incheon 21985, Republic of Korea
- Seung Won Lee
- Department of Precision Medicine, Sungkyunkwan University School of Medicine, Suwon 16419, Republic of Korea
|
24
|
Zafar A, Hussain SJ, Ali MU, Lee SW. Metaheuristic Optimization-Based Feature Selection for Imagery and Arithmetic Tasks: An fNIRS Study. Sensors (Basel, Switzerland) 2023; 23:3714. [PMID: 37050774 PMCID: PMC10098559 DOI: 10.3390/s23073714] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/02/2023] [Revised: 03/23/2023] [Accepted: 03/30/2023] [Indexed: 06/01/2023]
Abstract
In recent decades, the brain-computer interface (BCI) has emerged as a leading area of research. Feature selection is vital for reducing the dataset's dimensionality, increasing computational efficiency, and enhancing BCI performance. Using activity-related features leads to a high classification rate among the desired tasks. This study presents a wrapper-based metaheuristic feature selection framework for BCI applications using functional near-infrared spectroscopy (fNIRS). Here, the temporal statistical features (i.e., the mean, slope, maximum, skewness, and kurtosis) were computed from all the available channels to form a training vector. Seven metaheuristic optimization algorithms were tested for their classification performance using a k-nearest neighbor-based cost function: particle swarm optimization, cuckoo search optimization, the firefly algorithm, the bat algorithm, flower pollination optimization, whale optimization, and grey wolf optimization (GWO). The presented approach was validated on an available online dataset of motor imagery (MI) and mental arithmetic (MA) tasks from 29 healthy subjects. The results showed that the classification accuracy was significantly improved by utilizing the features selected by the metaheuristic optimization algorithms relative to that obtained from the full set of features. All of the abovementioned metaheuristic algorithms improved the classification accuracy and reduced the feature vector size. The GWO yielded the highest average classification rates (p < 0.01) of 94.83 ± 5.5%, 92.57 ± 6.9%, and 85.66 ± 7.3% for the MA, MI, and four-class (left- and right-hand MI, MA, and baseline) tasks, respectively. The presented framework may be helpful in the training phase for selecting the appropriate features for robust fNIRS-based BCI applications.
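The wrapper idea described in this abstract can be illustrated with a minimal sketch: a binary feature mask is scored by a nearest-neighbour classifier, and a search procedure looks for a mask with a lower score. Everything below is an assumption made for illustration and not the authors' setup: the toy data, the leave-one-out 1-NN error, the size penalty weight `alpha`, and in particular the plain random bit-flip search, which merely stands in for a metaheuristic such as GWO.

```python
import random

def loo_1nn_error(X, y, mask):
    """Leave-one-out error of a 1-NN classifier using only features with mask[j] == 1."""
    idx = [j for j, m in enumerate(mask) if m]
    if not idx:
        return 1.0  # an empty subset cannot classify anything
    errors = 0
    for i, xi in enumerate(X):
        best_d, best_label = float("inf"), None
        for k, xk in enumerate(X):
            if k == i:
                continue
            d = sum((xi[j] - xk[j]) ** 2 for j in idx)
            if d < best_d:
                best_d, best_label = d, y[k]
        errors += best_label != y[i]
    return errors / len(X)

def cost(X, y, mask, alpha=0.01):
    """Wrapper cost: classification error plus a penalty on the fraction of features kept."""
    return loo_1nn_error(X, y, mask) + alpha * sum(mask) / len(mask)

def random_flip_search(X, y, iters=200, seed=0):
    """Stand-in optimizer (not GWO): flip one random bit, keep it if the cost does not increase."""
    rng = random.Random(seed)
    n = len(X[0])
    mask = [1] * n
    best = cost(X, y, mask)
    for _ in range(iters):
        j = rng.randrange(n)
        mask[j] ^= 1
        c = cost(X, y, mask)
        if c <= best:
            best = c
        else:
            mask[j] ^= 1  # revert the flip
    return mask, best

# Hypothetical toy data: feature 0 separates the classes, features 1-3 are noise.
rng = random.Random(1)
X, y = [], []
for i in range(40):
    label = i % 2
    X.append([label * 5.0 + rng.gauss(0, 0.5)] + [rng.gauss(0, 3.0) for _ in range(3)])
    y.append(label)

full_cost = cost(X, y, [1, 1, 1, 1])
mask, best_cost = random_flip_search(X, y)
print("full-set cost:", full_cost, "selected mask:", mask, "cost:", best_cost)
```

Because the search only accepts flips that do not raise the cost, the returned cost can never exceed that of the full feature set; a population-based metaheuristic would replace `random_flip_search` while keeping the same `cost` function.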
Affiliation(s)
- Amad Zafar
- Department of Intelligent Mechatronics Engineering, Sejong University, Seoul 05006, Republic of Korea
- Shaik Javeed Hussain
- Department of Electrical and Electronics, Global College of Engineering and Technology, Muscat 112, Oman
- Muhammad Umair Ali
- Department of Intelligent Mechatronics Engineering, Sejong University, Seoul 05006, Republic of Korea
- Seung Won Lee
- Department of Precision Medicine, School of Medicine, Sungkyunkwan University, Suwon 16419, Republic of Korea
|
25
|
Turgut OE, Turgut MS, Kırtepe E. A systematic review of the emerging metaheuristic algorithms on solving complex optimization problems. Neural Comput Appl 2023. [DOI: 10.1007/s00521-023-08481-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
|
26
|
Universal Feature Selection Tool (UniFeat): An Open-Source Tool for Dimensionality Reduction. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2023.03.037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/18/2023]
|
27
|
Sun L, Si S, Ding W, Xu J, Zhang Y. BSSFS: binary sparrow search algorithm for feature selection. Int J Mach Learn Cyb 2023. [DOI: 10.1007/s13042-023-01788-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/08/2023]
|
28
|
Oda T. A Delaunay Edges and Simulated Annealing-Based Integrated Approach for Mesh Router Placement Optimization in Wireless Mesh Networks. Sensors (Basel, Switzerland) 2023; 23:1050. [PMID: 36772090 PMCID: PMC9920083 DOI: 10.3390/s23031050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Revised: 12/24/2022] [Accepted: 01/14/2023] [Indexed: 06/18/2023]
Abstract
Wireless Mesh Networks (WMNs) can build a communications infrastructure using only routers (called mesh routers), making it possible to form networks over a wide area at low cost. The mesh routers cover clients (called mesh clients), allowing mesh clients to communicate with different nodes. Since the communication performance of WMNs is affected by the position of mesh routers, it can be improved by optimizing the mesh router placement. In this paper, we present a Coverage Construction Method (CCM) that optimizes mesh router placement. In addition, we propose an integrated optimization approach that combines Simulated Annealing (SA) and Delaunay Edges (DE) in CCM to improve the performance of mesh router placement optimization. The proposed approach can build and provide a WMN-based communication infrastructure in disaster environments. We consider a real scenario for the placement of mesh clients in an evacuation area of Kurashiki City, Japan. From the simulation results, we found that the proposed approach can optimize the placement of mesh routers so as to cover all mesh clients in the evacuation area. Additionally, the DECCM-based SA approach covers more mesh clients than the CCM-based SA approach on average and can improve the network connectivity of WMNs.
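The simulated annealing component mentioned in this abstract can be sketched roughly as follows. This is a generic SA loop that maximizes the number of covered clients; it does not implement CCM or the Delaunay-edge step, and the square area, coverage radius, move size, temperature schedule, and all function names are assumptions chosen for the example.

```python
import math
import random

def coverage(routers, clients, radius):
    """Number of clients within `radius` of at least one router."""
    return sum(
        any(math.dist(c, r) <= radius for r in routers)
        for c in clients
    )

def anneal_placement(clients, n_routers, radius, size=100.0,
                     iters=2000, t0=5.0, cooling=0.995, seed=0):
    """Generic simulated annealing over router positions in a size x size area."""
    rng = random.Random(seed)
    routers = [(rng.uniform(0, size), rng.uniform(0, size)) for _ in range(n_routers)]
    current = coverage(routers, clients, radius)
    initial = current
    best, best_routers = current, list(routers)
    t = t0
    for _ in range(iters):
        i = rng.randrange(n_routers)
        old = routers[i]
        # Neighbour move: jitter one router, clamped to the area.
        routers[i] = (min(size, max(0.0, old[0] + rng.gauss(0, 10))),
                      min(size, max(0.0, old[1] + rng.gauss(0, 10))))
        cand = coverage(routers, clients, radius)
        delta = cand - current
        # Metropolis rule: always accept non-worsening moves, sometimes accept worse ones.
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current = cand
            if current > best:
                best, best_routers = current, list(routers)
        else:
            routers[i] = old  # reject the move
        t *= cooling
    return best_routers, best, initial

# Hypothetical scenario: 50 clients scattered uniformly, 8 routers with radius 15.
rng = random.Random(42)
clients = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(50)]
routers, covered, initial = anneal_placement(clients, n_routers=8, radius=15.0)
print("covered clients:", initial, "->", covered)
```

The loop tracks the best placement seen so far, so the returned coverage is never below that of the random starting placement even though individual steps may accept worse moves.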
Affiliation(s)
- Tetsuya Oda
- Department of Information Engineering, Okayama University of Science (OUS), Okayama 700-0005, Japan
|
29
|
Multiview nonnegative matrix factorization with dual HSIC constraints for clustering. Int J Mach Learn Cyb 2022. [DOI: 10.1007/s13042-022-01742-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/25/2022]
|
30
|
Robust nonparallel support vector machine with privileged information for pattern recognition. Int J Mach Learn Cyb 2022. [DOI: 10.1007/s13042-022-01709-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2022]
|
31
|
Algorithm for orthogonal matrix nearness and its application to feature representation. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.12.036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
|
32
|
Ibrahim AM, Tawhid MA. Chaotic electromagnetic field optimization. Artif Intell Rev 2022. [DOI: 10.1007/s10462-022-10324-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
|
33
|
Interaction-based clustering algorithm for feature selection: a multivariate filter approach. Int J Mach Learn Cyb 2022. [DOI: 10.1007/s13042-022-01726-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
|
34
|
Hybrid PSO (SGPSO) with the Incorporation of Discretization Operator for Training RBF Neural Network and Optimal Feature Selection. Arabian Journal for Science and Engineering 2022. [DOI: 10.1007/s13369-022-07408-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
|
35
|
Goyal S. Software fault prediction using evolving populations with mathematical diversification. Soft Comput 2022. [DOI: 10.1007/s00500-022-07445-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
|
36
|
Lai J, Chen H, Li T, Yang X. Adaptive graph learning for semi-supervised feature selection with redundancy minimization. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.07.102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
|
37
|
Liu P, Han S, Rong N, Fan J. Frequency Stability Prediction of Power Systems Using Vision Transformer and Copula Entropy. Entropy (Basel, Switzerland) 2022; 24:1165. [PMID: 36010829 PMCID: PMC9407505 DOI: 10.3390/e24081165] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2022] [Revised: 08/17/2022] [Accepted: 08/19/2022] [Indexed: 06/15/2023]
Abstract
This paper addresses the problem of frequency stability prediction (FSP) following active power disturbances in power systems by proposing a vision transformer (ViT) method that predicts frequency stability in real time. The core idea of the FSP approach employing the ViT is to use the time-series data of power system operations as ViT inputs to perform FSP accurately and quickly so that operators can decide frequency control actions, minimizing the losses caused by incidents. Additionally, due to the high-dimensional and redundant input data of the power system and the O(N^2) computational complexity of the transformer, feature selection based on copula entropy (CE) is used to construct image-like data with fixed dimensions from power system operation data and remove redundant information. Moreover, no previous FSP study has taken safety margins into consideration, which may threaten the secure operation of power systems. Therefore, a frequency security index (FSI) is used to form the sample labels, which are categorized as "insecurity", "relative security", and "absolute security". Finally, various case studies are carried out on a modified New England 39-bus system and a modified ACTIVSg500 system for projected 0% to 40% nonsynchronous system penetration levels. The simulation results demonstrate that the proposed method achieves state-of-the-art (SOTA) performance on normal, noisy, and incomplete datasets in comparison with eight machine-learning methods.
Affiliation(s)
- Peili Liu
- Department of Electrical Engineering, Guizhou University, Guiyang 550025, China
- Song Han
- Department of Electrical Engineering, Guizhou University, Guiyang 550025, China
- Na Rong
- Department of Electrical Engineering, Guizhou University, Guiyang 550025, China
- Junqiu Fan
- Department of Electrical Engineering, Guizhou University, Guiyang 550025, China
- Guian Company Guizhou Power Grid, Guiyang 550003, China
|
38
|
Riyahi M, Rafsanjani MK, Gupta BB, Alhalabi W. Multiobjective whale optimization algorithm-based feature selection for intelligent systems. Int J Intell Syst 2022. [DOI: 10.1002/int.22979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Affiliation(s)
- Milad Riyahi
- Department of Computer Science, Faculty of Mathematics and Computer, Shahid Bahonar University of Kerman, Kerman, Iran
- Marjan K. Rafsanjani
- Department of Computer Science, Faculty of Mathematics and Computer, Shahid Bahonar University of Kerman, Kerman, Iran
- Brij B. Gupta
- Department of Computer Science and Information Engineering, Asia University, Taichung, Taiwan
- Lebanese American University, Beirut, Lebanon
- Center for Interdisciplinary Research, UPES, Dehradun, India
- Research and Innovation Department, Skyline University College, Sharjah, United Arab Emirates
- Wadee Alhalabi
- Department of Electrical and Computer Engineering, University of Miami, Coral Gables, Florida, USA
|
39
|
Praveen S, Tyagi N, Singh B, Karetla GR, Thalor MA, Joshi K, Tsegaye M. PSO-Based Evolutionary Approach to Optimize Head and Neck Biomedical Image to Detect Mesothelioma Cancer. BioMed Research International 2022; 2022:3618197. [PMID: 36033562 PMCID: PMC9410819 DOI: 10.1155/2022/3618197] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/20/2022] [Revised: 06/30/2022] [Accepted: 07/21/2022] [Indexed: 11/17/2022]
Abstract
Mesothelioma is an aggressive and fatal form of cancer. It arises in the mesothelium, the thin layer of tissue that covers the majority of the internal organs. Treatments are available; however, a cure is not attainable for the majority of patients. Much research is therefore being done on the detection of mesothelioma using various approaches, but this paper focuses on optimization techniques for optimizing biomedical images to detect the cancer. Given the restricted number of samples in the medical field, a Relief-PSO feature selection approach for head and neck mesothelioma pathological images is proposed. The approach reduces dimensionality at multiple levels. First, the Relief technique assigns different feature weights depending on the relationship between features and categories. Second, hybrid binary particle swarm optimization (HBPSO) is suggested to automatically determine the optimum feature subset from the candidate feature subsets. The technique outperforms seven other feature selection algorithms in terms of morphological feature screening, dimensionality reduction, and classification performance.
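The first stage described above, Relief weighting, can be sketched as follows. This is the basic two-class Relief update applied once to every sample, with features assumed to be scaled to [0, 1]; the toy data and the helper `top_k_mask` are hypothetical, and the HBPSO stage that would refine the subset is not shown.

```python
def relief_weights(X, y):
    """Basic two-class Relief, visiting every sample once (features assumed in [0, 1])."""
    n, d = len(X), len(X[0])
    w = [0.0] * d

    def dist(a, b):
        # Manhattan distance between two samples.
        return sum(abs(p - q) for p, q in zip(a, b))

    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        hit = min(hits, key=lambda k: dist(X[i], X[k]))      # nearest same-class sample
        miss = min(misses, key=lambda k: dist(X[i], X[k]))   # nearest other-class sample
        for j in range(d):
            # Reward features that differ on the nearest miss, penalize those that differ on the nearest hit.
            w[j] += (abs(X[i][j] - X[miss][j]) - abs(X[i][j] - X[hit][j])) / n
    return w

def top_k_mask(w, k):
    """Hypothetical helper: binary mask keeping the k highest-weighted features."""
    keep = set(sorted(range(len(w)), key=lambda j: w[j], reverse=True)[:k])
    return [1 if j in keep else 0 for j in range(len(w))]

# Toy data: feature 0 tracks the class label, feature 1 does not.
X = [[0.1, 0.2], [0.2, 0.9], [0.15, 0.5], [0.9, 0.3], [0.8, 0.8], [0.85, 0.5]]
y = [0, 0, 0, 1, 1, 1]

w = relief_weights(X, y)
mask = top_k_mask(w, 1)
print("weights:", w, "mask:", mask)
```

On this toy data the class-related feature receives a positive weight and the unrelated one a negative weight, so a mask built from the ranking keeps only feature 0. In a two-stage scheme such as the one the abstract outlines, a ranking like this would narrow the candidate features before a binary search method selects the final subset.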
Affiliation(s)
- Neha Tyagi
- Department of IT, G.L Bajaj Institute of Technology & Management, Greater Noida, India
- Bhagwant Singh
- Informatics Cluster, School of Computer Science, University of Petroleum and Energy Studies (UPES), Dehradun, Uttarakhand, 248007, India
- Girija Rani Karetla
- School of Computer, Data and Mathematical Sciences, Western Sydney University, Sydney, Australia
|