1
Emam MM, Houssein EH, Samee NA, Alkhalifa AK, Hosney ME. Optimizing cancer diagnosis: A hybrid approach of genetic operators and Sinh Cosh Optimizer for tumor identification and feature gene selection. Comput Biol Med 2024; 180:108984. [PMID: 39128177 DOI: 10.1016/j.compbiomed.2024.108984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2024] [Revised: 06/30/2024] [Accepted: 08/02/2024] [Indexed: 08/13/2024]
Abstract
The identification of tumors through gene analysis in microarray data is a pivotal area of research in artificial intelligence and bioinformatics. This task is challenging due to the large number of genes relative to the limited number of observations, making feature selection a critical step. This paper introduces a novel wrapper feature selection method that leverages a hybrid optimization algorithm combining a genetic operator with a Sinh Cosh Optimizer (SCHO), termed SCHO-GO. The SCHO-GO algorithm is designed to avoid local optima, streamline the search process, and select the most relevant features without compromising classifier performance. Traditional methods often falter with extensive search spaces, necessitating hybrid approaches. Our method aims to reduce the dimensionality and improve the classification accuracy, which is essential in pattern recognition and data analysis. The SCHO-GO algorithm, integrated with a support vector machine (SVM) classifier, significantly enhances cancer classification accuracy. We evaluated the performance of SCHO-GO using the CEC'2022 benchmark function and compared it with seven well-known metaheuristic algorithms. Statistical analyses indicate that SCHO-GO consistently outperforms these algorithms. Experimental tests on eight microarray gene expression datasets, particularly the Gene Expression Cancer RNA-Seq dataset, demonstrate an impressive accuracy of 99.01% with the SCHO-GO-SVM model, highlighting its robustness and precision in handling complex datasets. Furthermore, the SCHO-GO algorithm excels in feature selection and solving mathematical benchmark problems, presenting a promising approach for tumor identification and classification in microarray data analysis.
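A wrapper method such as the one described scores each candidate gene subset by training a classifier on it. The following is a minimal sketch of such a scoring function, not the paper's implementation: the weight `alpha`, the function names, and the `error_fn` callback (standing in for a cross-validated SVM error rate) are assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def wrapper_fitness(
    mask: Sequence[int],
    error_fn: Callable[[List[int]], float],
    alpha: float = 0.99,  # assumed trade-off weight, not taken from the paper
) -> float:
    """Score a binary feature mask; lower is better.

    Combines the classifier error on the selected features with the
    fraction of features kept, so smaller subsets win ties.
    """
    selected = [i for i, bit in enumerate(mask) if bit]
    if not selected:
        return 1.0  # worst score: an empty subset cannot be classified
    error = error_fn(selected)  # e.g. a cross-validated error rate in [0, 1]
    ratio = len(selected) / len(mask)
    return alpha * error + (1.0 - alpha) * ratio


def toy_error(selected: List[int]) -> float:
    """Hypothetical stand-in for a classifier: only features 0 and 2 help."""
    useful = {0, 2}
    return 1.0 - len(useful & set(selected)) / len(useful)
```

An optimizer would call `wrapper_fitness` once per candidate mask and keep the mask with the lowest score. With `toy_error`, the mask `[1, 0, 1, 0]` scores better than `[1, 1, 1, 1]` because it reaches the same error with fewer features.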
Affiliation(s)
- Marwa M Emam
  - Faculty of Computers and Information, Minia University, Minia, Egypt.
- Essam H Houssein
  - Faculty of Computers and Information, Minia University, Minia, Egypt.
- Nagwan Abdel Samee
  - Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia.
- Amal K Alkhalifa
  - Department of Computer Science and Information Technology, Applied College, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia.
- Mosa E Hosney
  - Faculty of Computers and Information, Luxor University, Luxor, Egypt.
2
Shobana M, Balasraswathi VR, Radhika R, Oleiwi AK, Chaudhury S, Ladkat AS, Naved M, Rahmani AW. Classification and Detection of Mesothelioma Cancer Using Feature Selection-Enabled Machine Learning Technique. BioMed Research International 2022; 2022:9900668. [PMID: 35937383 PMCID: PMC9348925 DOI: 10.1155/2022/9900668] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/02/2022] [Revised: 06/30/2022] [Accepted: 07/14/2022] [Indexed: 11/18/2022]
Abstract
Malignant mesothelioma (MM), a cancer of the mesothelium, is a rare disease that is almost always fatal. Chemotherapy, surgery, radiation therapy, and immunotherapy are all potential treatments for MM; however, most patients are diagnosed at an advanced stage, by which time the disease is resistant to these therapies. After a diagnosis of advanced MM, average survival is one year. There is a substantial link between asbestos exposure and MM. This article proposes a classification and detection method for mesothelioma that combines feature selection with machine learning. Correlation-based feature selection (CFS) is applied first; it acts as a filter, selecting only the features relevant to the classification, which improves the accuracy of the classification model. Classification is then carried out with naive Bayes, fuzzy SVM, and the ID3 algorithm. Several metrics are used to measure the effectiveness of the machine learning methods. The results show that the choice of features has a substantial influence on classification accuracy.
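As a rough illustration of how a correlation-based filter ranks feature subsets, the sketch below computes the standard CFS merit, k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation. It is not the article's code: the function names are hypothetical, and the use of absolute Pearson correlation as the correlation measure is an assumption.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Absolute Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation signal
    return abs(cov / sqrt(var_x * var_y))


def cfs_merit(columns, target):
    """CFS merit of a non-empty feature subset given as a list of columns."""
    k = len(columns)
    # Mean correlation between each feature and the class label.
    r_cf = mean(pearson(col, target) for col in columns)
    # Mean correlation between every pair of features (0 for one feature).
    pairs = list(combinations(columns, 2))
    r_ff = mean(pearson(a, b) for a, b in pairs) if pairs else 0.0
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)
```

The merit rewards features that correlate with the class and penalizes features that correlate with each other, so adding an exact duplicate of a good feature does not raise the score, and adding an irrelevant feature lowers it.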
Affiliation(s)
- M. Shobana
  - SRM Institute of Science and Technology, SRM Nagar, Kattankulathur, Kanchipuram, 603203, Chennai, India
- V. R. Balasraswathi
  - Department of Networking and Communications, School of Computing, SRM Institute of Science and Technology, Kattankulathur, India
- R. Radhika
  - Department of Networking and Communications, School of Computing, SRM Institute of Science and Technology, Kattankulathur, India
- Ahmed Kareem Oleiwi
  - Department of Computer Technical Engineering, The Islamic University, 54001 Najaf, Iraq
- Ajay S. Ladkat
  - Department of Instrumentation Engineering, Vishwakarma Institute of Technology, Pune, India
- Mohd Naved
  - Amity International Business School (AIBS), Amity University, Noida, India
3
Senthil Kumar J, Balamurugan SAA, Sasikala S. A Novel Tuberculosis Prediction Model by Extracting Radiological Features Present in Chest X-ray Images Using Modified Discrete Grey Wolf Optimizer Based Segmentation. Journal of Medical Imaging and Health Informatics 2021. [DOI: 10.1166/jmihi.2021.3837] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Abstract
According to the WHO 2019 survey report, approximately 10 million people suffered from tuberculosis (TB) in 2018, a number that has remained fairly stable in recent years. The infection rate varies widely among countries, from fewer than 5 to more than 500 new infections per 100,000 people each year, with a global average of around 130. Around 1.2 million deaths occurred among HIV-negative people in 2018. Earlier diagnosis would help keep the death rate under control, but sophisticated testing techniques tend to be too costly for wide adoption. Among the most important methods for TB diagnosis is the interpretation of thoracic X-ray images through image processing; identifying the various structures on thoracic X-rays and assessing anomalies is an important stage in computer-aided diagnosis systems. Chest form and size may contain indications of serious disorders such as pneumothorax, pneumoconiosis, tuberculosis, and emphysema. Substantial work has contributed to simplifying diagnosis by applying various statistical strategies to medical images, reducing the time required and dramatically lowering overhead costs. In addition, recent advances in deep learning have produced excellent results in image detection in different fields, but their use in diagnosing TB remains limited. This work therefore focuses on developing a novel approach to disease detection. The concepts presented are put into practice and linked to the current literature. We propose an automatic approach for TB identification and diagnosis in conventional posteroanterior chest X-rays. We segment the chest X-ray image with a modified discrete grey wolf optimizer to eradicate abnormal areas and shape abnormalities. We extract various features from the X-ray image with shearlet extraction, which allows the image to be classified as normal or abnormal by a deep learning classifier, an improved residual VGG net CNN with big data. We test the efficiency of our system on the Shenzhen Hospital Chest X-ray data set. The suggested technique, which relies on Masi entropy-based discrete grey wolf optimizer segmentation with an improved residual VGG net CNN, achieves competitive results with a comparatively shorter training period and greater precision. All simulations are carried out in a MATLAB environment.
Affiliation(s)
- J. Senthil Kumar
  - Department of Computer Science, Kalaignar Karunanidhi Institute of Technology, Pallapalayam, Kannampalayam 641402, Tamil Nadu, India
- S. Sasikala
  - Department of Computer Science and Engineering, Velammal College of Engineering and Technology, Madurai 625009, Tamil Nadu, India
4
Multi-variant differential evolution algorithm for feature selection. Sci Rep 2020; 10:17261. [PMID: 33057120 PMCID: PMC7560894 DOI: 10.1038/s41598-020-74228-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 09/20/2019] [Accepted: 09/28/2020] [Indexed: 11/29/2022] Open
Abstract
This work introduces a new population-based stochastic search technique, the multi-variant differential evolution (MVDE) algorithm, which is applied to fifteen well-known real-world problems from the UCI repository and compared with four popular optimization methods. MVDE proposes a new self-adaptive scaling factor based on cosine and logistic distributions, making it an almost factor-free optimization technique. To increase the chances of an update, this factor is binary-mapped by incorporating an adaptive crossover operator. During evolution, both greedy and less-greedy variants are managed by adjusting the binary scaling factor and the elite identification mechanism and incorporating them into a new multi-mutation crossover process across a number of sequential evolutionary phases. Feature selection reduces the number of features by eliminating irrelevant, misleading, noisy, and redundant data, which can accelerate classification. This paper presents a new feature selection algorithm based on MVDE and an artificial neural network (ANN), which enables MVDE to obtain a combined feature set, improve classification accuracy, and optimize both the structure and the weights of the ANN simultaneously. The experimental results show encouraging behavior of the proposed algorithm in terms of classification accuracy and the number of features selected.
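For readers unfamiliar with differential evolution, the sketch below shows one step of the classic DE/rand/1/bin scheme followed by a simple threshold mapping to a binary feature mask. It does not reproduce MVDE: the fixed scaling factor `f`, the crossover rate `cr`, the 0.5 threshold, and all function names are assumptions, whereas the paper describes a self-adaptive scaling factor and multiple mutation variants.

```python
import random
from typing import List


def de_trial_vector(pop: List[List[float]], i: int, rng: random.Random,
                    f: float = 0.5, cr: float = 0.9) -> List[float]:
    """Build one trial vector for individual i with classic DE/rand/1/bin.

    Requires a population of at least four individuals.
    """
    candidates = [j for j in range(len(pop)) if j != i]
    a, b, c = (pop[j] for j in rng.sample(candidates, 3))
    dim = len(pop[i])
    forced = rng.randrange(dim)  # guarantees at least one mutated gene
    trial = []
    for d in range(dim):
        if d == forced or rng.random() < cr:
            trial.append(a[d] + f * (b[d] - c[d]))  # differential mutation
        else:
            trial.append(pop[i][d])  # inherit the gene from the parent
    return trial


def to_mask(vector: List[float], threshold: float = 0.5) -> List[int]:
    """Map a continuous vector to a binary feature mask (assumed rule)."""
    return [1 if v > threshold else 0 for v in vector]
```

In a feature selection loop, the mask from `to_mask` would pick the columns passed to the classifier, and the trial vector would replace its parent only if the resulting score is better.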
5
Basavegowda HS, Dagnew G. Deep learning approach for microarray cancer data classification. CAAI Transactions on Intelligence Technology 2020. [DOI: 10.1049/trit.2019.0028] [Citation(s) in RCA: 110] [Impact Index Per Article: 27.5] [Indexed: 12/27/2022] Open
Affiliation(s)
- Hema Shekar Basavegowda
  - Department of Studies and Research in Computer Science, Mangalore University, Mangalore, Karnataka, India
- Guesh Dagnew
  - Department of Studies and Research in Computer Science, Mangalore University, Mangalore, Karnataka, India
6
Appavu alias Balamurugan S, Nancy SG. An Efficient Feature Selection and Classification Using Optimal Radial Basis Function Neural Network. Int J Uncertain Fuzz 2018. [DOI: 10.1142/s0218488518500320] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/18/2022]
Abstract
Feature selection is the process of identifying and removing irrelevant and redundant features, both of which severely affect the accuracy of learning machines. In high-dimensional space, finding clusters of data objects is challenging because of the curse of dimensionality: as dimensionality increases, data in the irrelevant dimensions may produce considerable noise. Time complexity is also a major issue in existing approaches. To address these issues, the proposed method uses efficient feature subset selection on high-dimensional data, taking a high-dimensional microarray dataset as input. First, optimal features are selected with a Modified Social Spider Optimization (MSSO) algorithm, in which traditional Social Spider Optimization is modified with the help of the fruit fly optimization algorithm. The selected features are then passed to the classifier. Classification is performed with an Optimized Radial Basis Function Neural Network (ORBFNN), which classifies the microarray data as normal or abnormal; the RBFNN is optimized by means of the artificial bee colony (ABC) algorithm. Experimental results indicate that the proposed classification framework outperforms the existing technique, achieving accuracies of 93.66%, 97.09%, 98.66%, 98.28%, and 98.93% on five benchmark datasets. The proposed method is implemented on the MATLAB platform.
Affiliation(s)
- S. Gilbert Nancy
  - Department of Computer Science and Engineering, Bannari Amman Institute of Technology, Sathyamangalam, Erode District, Tamil Nadu, India