1
Alsaykhan LK, Maashi MS. A hybrid detection model for acute lymphocytic leukemia using support vector machine and particle swarm optimization (SVM-PSO). Sci Rep 2024; 14:23483. [PMID: 39379598 PMCID: PMC11461623 DOI: 10.1038/s41598-024-74889-1] [Received: 05/30/2024] [Accepted: 09/30/2024] [Indexed: 10/10/2024] Open
Abstract
Leukemia, a hematological disease affecting the bone marrow and white blood cells (WBCs), ranks among the top ten causes of mortality worldwide. Delays in decision-making often hinder the timely application of suitable medical treatments. Acute lymphoblastic leukemia (ALL) is one of the primary forms, constituting approximately 25% of childhood cancer cases. However, automated ALL diagnosis is challenging. Recently, machine learning (ML) has emerged as an important tool for building detection models. In this study, we present a hybrid detection model that improves the accuracy of the detection process by combining support vector machine (SVM) and particle swarm optimization (PSO) approaches to automatically identify ALL. We use SVM to represent a two-dimensional image and complete the classification process. PSO is employed to enhance the performance of the SVM model, reducing error rates and enhancing result accuracy. The input images are obtained from two public datasets (ALL-IDB1 and ALL-IDB2), and public online datasets are utilized for training and testing the proposed model. The results indicate that our hybrid SVM-PSO model has high accuracy, outperforming stand-alone algorithms and demonstrating superior performance, an enhanced confusion matrix, and a higher detection rate. This advancement holds promise for enhancing the quality of technical software in the medical field using machine learning.
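As a rough illustration of how PSO can tune a classifier such as an SVM, the sketch below runs a basic particle swarm over two hyperparameters. It is not the authors' implementation: the objective `toy_validation_error` is a hypothetical stand-in for a cross-validated SVM error, and the swarm size, iteration count, and coefficients are assumed values chosen only for the example.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize `objective` over the box `bounds` = [(lo, hi), ...] with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random starting positions inside the box, zero starting velocities.
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in positions]           # each particle's best position so far
    best_val = [objective(p) for p in positions]
    g = min(range(n_particles), key=lambda i: best_val[i])
    swarm_pos, swarm_val = best_pos[g][:], best_val[g]  # best position seen by the swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull toward personal best + pull toward swarm best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_personal * r1 * (best_pos[i][d] - positions[i][d])
                    + c_global * r2 * (swarm_pos[d] - positions[i][d])
                )
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            val = objective(positions[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, positions[i][:]
                if val < swarm_val:
                    swarm_val, swarm_pos = val, positions[i][:]
    return swarm_pos, swarm_val


def toy_validation_error(params):
    """Hypothetical stand-in for an SVM validation error over (log10 C, log10 gamma)."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


best_params, best_error = pso_minimize(
    toy_validation_error, bounds=[(-5.0, 5.0), (-5.0, 5.0)]
)
print(best_params, best_error)
```

In a real pipeline the objective would train the SVM with the candidate hyperparameters and return a validation error, which makes each evaluation far more expensive than this toy function.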
Affiliation(s)
- Lama K Alsaykhan
  - Department of Software Engineering, College of Computer and Information Sciences, King Saud University, Riyadh, 11451, Kingdom of Saudi Arabia
- Mashael S Maashi
  - Department of Software Engineering, College of Computer and Information Sciences, King Saud University, Riyadh, 11451, Kingdom of Saudi Arabia
2
Aymaz S. Boosting medical diagnostics with a novel gradient-based sample selection method. Comput Biol Med 2024; 182:109165. [PMID: 39321580 DOI: 10.1016/j.compbiomed.2024.109165] [Received: 06/20/2024] [Revised: 09/09/2024] [Accepted: 09/16/2024] [Indexed: 09/27/2024]
Abstract
In the rapidly expanding landscape of medical data, the need for innovative approaches to maximize classification performance has become increasingly critical. As data volumes grow, ensuring that diagnostic systems work with accurate and relevant data is paramount for effective and generalizable classification. This study introduces a novel gradient-based sample selection method, the first of its kind in the literature, specifically designed to enhance classification accuracy by removing redundant and non-informative data. Unlike traditional methods that focus solely on feature selection, this approach integrates an advanced sample selection technique to optimize the input data, leading to more accurate and efficient diagnostics. The method is validated on multiple disease datasets, including the Wisconsin Diagnostic Breast Cancer (WDBC) dataset and the Cleveland Coronary Artery Disease (CAD) dataset, demonstrating its broad applicability and effectiveness. To address dataset imbalance, the Adaptive Synthetic Sampling (ADASYN) method is employed, followed by Particle Swarm Optimization (PSO) for feature selection. The refined datasets are then classified using a Support Vector Machine (SVM), showing that even traditional classifiers can achieve substantial improvements when enhanced with advanced sample selection. The results underscore the critical importance of precise sample selection in boosting classification performance, setting a new standard for computer-aided diagnostics and paving the way for future innovations in handling large and complex medical datasets.
Affiliation(s)
- Samet Aymaz
  - Trabzon University, Department of Computer Engineering, Trabzon, Turkiye
3
Fan S, Abulizi A, You Y, Huang C, Yimit Y, Li Q, Zou X, Nijiati M. Predicting hospitalization costs for pulmonary tuberculosis patients based on machine learning. BMC Infect Dis 2024; 24:875. [PMID: 39198742 PMCID: PMC11360310 DOI: 10.1186/s12879-024-09771-6] [Received: 03/29/2024] [Accepted: 08/20/2024] [Indexed: 09/01/2024] Open
Abstract
BACKGROUND Pulmonary tuberculosis (PTB) is a prevalent chronic disease associated with a significant economic burden on patients. Using machine learning to predict hospitalization costs can help allocate medical resources effectively and rationalize the cost structure, so as to better control patients' hospitalization costs. METHODS This research analyzed data (2020-2022) from a Kashgar pulmonary hospital's information system, involving 9570 eligible PTB patients. SPSS 26.0 was used for multiple regression analysis, while Python 3.7 was used for random forest regression (RFR) and multilayer perceptron (MLP) models. The training set included data from 2020 and 2021, while the test set included data from 2022. The models predicted seven costs for PTB patients: diagnostic cost, medical service cost, material cost, treatment cost, drug cost, other cost, and total hospitalization cost. Predictive performance was evaluated using R-squared (R²), Root Mean Squared Error (RMSE), and Mean Absolute Error (MAE). RESULTS Among the 9570 PTB patients included in the study, the median (quartiles) of total hospitalization cost was 13,150.45 (9891.34, 19,648.48) yuan. Nine factors, including age, marital status, admission condition, length of hospital stay, initial treatment, presence of other diseases, transfer, drug resistance, and admission department, significantly influenced hospitalization costs for PTB patients. Overall, MLP demonstrated superior performance in most cost predictions, outperforming RFR and multiple regression; the performance of RFR fell between that of MLP and multiple regression; and multiple regression had the lowest predictive performance overall but showed the best results for other costs. CONCLUSION The MLP can effectively leverage patient information and accurately predict various hospitalization costs, achieving a rationalized structure of hospitalization costs by adjusting higher-cost inpatient items and balancing different cost categories. The insights of this predictive model also hold relevance for research in other medical conditions.
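The three evaluation metrics named in the abstract can be computed directly from paired actual and predicted values. The sketch below is a minimal illustration; the function name `regression_metrics` and the cost values are made up for the example and are not taken from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R-squared, RMSE, and MAE for paired actual/predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual sum of squares and total sum of squares around the mean.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


# Illustrative values only (not data from the paper).
actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 18.0]
scores = regression_metrics(actual, predicted)
print(scores)
```

RMSE penalizes large errors more heavily than MAE, which is why the two are usually reported together for skewed quantities such as hospitalization costs.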
Affiliation(s)
- Shiyu Fan
  - Department of Preventive Healthcare, Shihezi University, Shihezi, 832000, China
- Abudoukeyoumujiang Abulizi
  - Department of Radiology, The First People's Hospital of Kashi (Kashgar) Prefecture, Kashgar, 844000, China
  - Xinjiang Key Laboratory of Artificial Intelligence Assisted Imaging Diagnosis, Kashgar, 844000, China
- Yi You
  - Department of Research Collaboration, Hangzhou Deepwise & League of PHD Technology Co., Ltd, R&D Center, Hangzhou, 311101, China
- Chencui Huang
  - Department of Research Collaboration, Hangzhou Deepwise & League of PHD Technology Co., Ltd, R&D Center, Hangzhou, 311101, China
- Yasen Yimit
  - Department of Radiology, The First People's Hospital of Kashi (Kashgar) Prefecture, Kashgar, 844000, China
  - Xinjiang Key Laboratory of Artificial Intelligence Assisted Imaging Diagnosis, Kashgar, 844000, China
- Qiange Li
  - Department of Preventive Healthcare, Shihezi University, Shihezi, 832000, China
- Xiaoguang Zou
  - Xinjiang Key Laboratory of Artificial Intelligence Assisted Imaging Diagnosis, Kashgar, 844000, China
  - Xinjiang Health Commission, Urumqi, 830000, China
- Mayidili Nijiati
  - Xinjiang Key Laboratory of Artificial Intelligence Assisted Imaging Diagnosis, Kashgar, 844000, China
  - The Fourth Affiliated Hospital of Xinjiang Medical University, Urumqi, 830000, China
4
Talpur F, Korejo IA, Chandio AA, Ghulam A, Talpur MSH. ML-Based Detection of DDoS Attacks Using Evolutionary Algorithms Optimization. Sensors (Basel, Switzerland) 2024; 24:1672. [PMID: 38475208 DOI: 10.3390/s24051672] [Received: 01/14/2024] [Revised: 02/15/2024] [Accepted: 02/29/2024] [Indexed: 03/14/2024]
Abstract
The escalating reliance of modern society on information and communication technology has rendered it vulnerable to an array of cyber-attacks, with distributed denial-of-service (DDoS) attacks emerging as one of the most prevalent threats. This paper delves into the intricacies of DDoS attacks, which exploit compromised machines numbering in the thousands to disrupt data services and online commercial platforms, resulting in significant downtime and financial losses. Recognizing the gravity of this issue, various detection techniques have been explored, yet the quantity and prior detection of DDoS attacks has seen a decline in recent methods. This research introduces an innovative approach by integrating evolutionary optimization algorithms and machine learning techniques. Specifically, the study proposes XGB-GA Optimization, RF-GA Optimization, and SVM-GA Optimization methods, employing Evolutionary Algorithms (EAs) Optimization with Tree-based Pipelines Optimization Tool (TPOT)-Genetic Programming. Datasets pertaining to DDoS attacks were utilized to train machine learning models based on XGB, RF, and SVM algorithms, and 10-fold cross-validation was employed. The models were further optimized using EAs, achieving remarkable accuracy scores: 99.99% with the XGB-GA method, 99.50% with RF-GA, and 99.99% with SVM-GA. Furthermore, the study employed TPOT to identify the optimal algorithm for constructing a machine learning model, with the genetic algorithm pinpointing XGB-GA as the most effective choice. This research significantly advances the field of DDoS attack detection by presenting a robust and accurate methodology, thereby enhancing the cybersecurity landscape and fortifying digital infrastructures against these pervasive threats.
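The 10-fold cross-validation mentioned in the abstract can be sketched as an index-splitting routine: the samples are shuffled once, dealt into ten folds, and each fold serves as the test set exactly once. The helper name `k_fold_indices`, the seed, and the sample count below are assumptions for illustration, not details from the paper.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        # Every other fold contributes to the training set for this round.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


splits = list(k_fold_indices(25, k=10))
for train_idx, test_idx in splits[:2]:
    print(len(train_idx), len(test_idx))
```

A model would be trained and scored once per split, and the ten scores averaged to give the reported accuracy.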
Affiliation(s)
- Fauzia Talpur
  - Institute of Mathematics & Computer Science, University of Sindh, Jamshoro 70680, Sindh, Pakistan
- Imtiaz Ali Korejo
  - Institute of Mathematics & Computer Science, University of Sindh, Jamshoro 70680, Sindh, Pakistan
- Aftab Ahmed Chandio
  - Institute of Mathematics & Computer Science, University of Sindh, Jamshoro 70680, Sindh, Pakistan
- Ali Ghulam
  - Information Technology Centre, Sindh Agriculture University, Tandojam 70060, Sindh, Pakistan
5
Revathi T, Balasubramaniam S, Sureshkumar V, Dhanasekaran S. An Improved Long Short-Term Memory Algorithm for Cardiovascular Disease Prediction. Diagnostics (Basel) 2024; 14:239. [PMID: 38337755 PMCID: PMC10855367 DOI: 10.3390/diagnostics14030239] [Received: 11/21/2023] [Revised: 01/17/2024] [Accepted: 01/21/2024] [Indexed: 02/12/2024] Open
Abstract
Cardiovascular diseases, prevalent as leading health concerns, demand early diagnosis for effective risk prevention. Despite numerous diagnostic models, challenges persist in network configuration and performance degradation, impacting model accuracy. In response, this paper introduces the Optimally Configured and Improved Long Short-Term Memory (OCI-LSTM) model as a robust solution. The Salp Swarm Algorithm is used to systematically eliminate irrelevant features, and the Genetic Algorithm is employed to optimize the LSTM's network configuration. Validation metrics, including accuracy, sensitivity, specificity, and F1 score, affirm the model's efficacy. Comparative analysis with a Deep Neural Network and Deep Belief Network establishes the OCI-LSTM's superiority, showcasing a notable accuracy of 97.11%. These advancements position the OCI-LSTM as a promising model for accurate and efficient early diagnosis of cardiovascular diseases. Future research could explore real-world implementation and further refinement for seamless integration into clinical practice.
Affiliation(s)
- T.K. Revathi
  - Department of Computer Science and Engineering, Sona College of Technology, Salem 636005, India
- Vidhushavarshini Sureshkumar
  - Department of Computer Science and Engineering, Faculty of Engineering and Technology, SRM Institute of Science and Technology, Vadapalani Campus, Chennai 600026, India
6
Noroozi Z, Orooji A, Erfannia L. Analyzing the impact of feature selection methods on machine learning algorithms for heart disease prediction. Sci Rep 2023; 13:22588. [PMID: 38114600 PMCID: PMC10730875 DOI: 10.1038/s41598-023-49962-w] [Received: 08/27/2023] [Accepted: 12/14/2023] [Indexed: 12/21/2023] Open
Abstract
The present study examines the role of feature selection methods in optimizing machine learning algorithms for predicting heart disease. The Cleveland heart disease dataset was used with sixteen feature selection techniques in three categories: filter, wrapper, and evolutionary. Seven algorithms (Bayes net, Naïve Bayes (BN), multivariate linear model (MLM), Support Vector Machine (SVM), logit boost, j48, and Random Forest) were then applied to identify the best models for heart disease prediction. Precision, F-measure, Specificity, Accuracy, Sensitivity, ROC area, and PRC were measured to compare the effect of feature selection methods on prediction algorithms. The results demonstrate that feature selection significantly improved performance for some models (e.g., j48), whereas it decreased performance for others (e.g., MLP, RF). SVM-based filtering methods have a best-fit accuracy of 85.5. In the best case, filtering methods improve model accuracy by +2.3. The SVM-CFS/information gain/symmetrical uncertainty methods show the highest improvement in this index. The filter feature selection methods with the highest number of features selected outperformed other methods in terms of models' accuracy, precision, and F-measure. However, wrapper-based and evolutionary algorithms improved models' performance in terms of sensitivity and specificity.
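Several of the measures listed in the abstract follow directly from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name `binary_metrics` and the counts passed to it are illustrative assumptions, not results from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)        # share of predicted positives that are correct
    sensitivity = tp / (tp + fn)      # recall: share of actual positives found
    specificity = tn / (tn + fp)      # share of actual negatives correctly rejected
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F-measure is the harmonic mean of precision and sensitivity.
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }


# Illustrative counts only (not data from the paper).
m = binary_metrics(tp=40, fp=10, tn=45, fn=5)
print(m)
```

Because sensitivity and specificity depend on different cells than accuracy and precision, a feature selection method can improve one pair while degrading the other, which is consistent with the trade-off the abstract reports.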
Affiliation(s)
- Zeinab Noroozi
  - Department of Artificial Intelligence, Islamic Azad University of Kazeroon, Kazeroon, Iran
- Azam Orooji
  - Department of Advanced Technologies, School of Medicine, North Khorasan University of Medical Sciences (NKUMS), Bojnurd, North Khorasan, Iran
- Leila Erfannia
  - Health Human Resources Research Center, Clinical Education Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
  - Health Information Management Department, School of Health Management and Information Sciences, Shiraz University of Medical Sciences, Shiraz, Iran
7
Doppala BP, Al Bataineh A, Vamsi B. An Efficient, Lightweight, Tiny 2D-CNN Ensemble Model to Detect Cardiomegaly in Heart CT Images. J Pers Med 2023; 13:1338. [PMID: 37763106 PMCID: PMC10532522 DOI: 10.3390/jpm13091338] [Received: 07/17/2023] [Revised: 08/26/2023] [Accepted: 08/28/2023] [Indexed: 09/29/2023] Open
Abstract
Cardiomegaly is a significant global health concern, especially in developing nations. Although advanced clinical care is available for newly diagnosed patients, many in resource-limited regions face late diagnoses and consequent increased mortality. This challenge is accentuated by a scarcity of radiography equipment and radiologists. Hence, we propose the development of a computer-aided diagnostic (CAD) system, specifically a lightweight, tiny 2D-CNN ensemble model, to facilitate early detection and, potentially, reduce mortality rates. Deep learning, with its subset of convolutional neural networks (CNN), has shown potential in visual applications, especially in medical image diagnosis. However, traditional deep CNNs often face compatibility issues with object-oriented human factor technology. Our proposed model aims to bridge this gap. Using CT scan images sourced from the Mendeley data center, our tiny 2D-CNN ensemble learning model achieved an accuracy of 96.32%, offering a promising tool for efficient and accurate cardiomegaly detection.
Affiliation(s)
- Ali Al Bataineh
  - Artificial Intelligence Center, Norwich University, Northfield, VT 05663, USA
- Bandi Vamsi
  - Department of Computer Science—Artificial Intelligence & Data Science, Madanapalle Institute of Technology & Science, Madanapalle 517325, India
8
Rahimi M, Ebrahimi H. Data driven of underground water level using artificial intelligence hybrid algorithms. Sci Rep 2023; 13:10359. [PMID: 37365165 DOI: 10.1038/s41598-023-35255-9] [Received: 03/17/2023] [Accepted: 05/15/2023] [Indexed: 06/28/2023] Open
Abstract
As the population grows and industry and agriculture develop, water resources require both quantitative and qualitative management. Managing these resources is essential to their exploitation and development. Studying water level fluctuations is therefore important for assessing underground water storage, particularly in Khuzestan province, which has a dry climate. Existing methods for predicting and managing water resources are chosen according to their strengths, their weaknesses, and the conditions of the study. In recent years, artificial intelligence has been used extensively for groundwater resources worldwide and has produced good results. In this study, three new hybrid models, FF-KNN, ABC-KNN, and DL-FF-KNN-ABC-MLP, are used to predict the underground water level in Khuzestan province (Qale-Tol area). The novelty of the technique is that it first performs classification with the first block (a combination of the FF-DWKNN algorithm) and then predicts with the second block (a combination of the ABC-MLP algorithm), which enables the algorithm to reduce data noise. Part of the data from wells 1-5 was used to build and test the hybrid models, and wells 6-8 were used to check them. The RMSE values of this algorithm for the test, train, and total data are 0.0451, 0.0597, and 0.0701, respectively. According to the reported results, the accuracy of DL-FF-KNN-ABC-MLP in predicting this key parameter is very high.
Affiliation(s)
- Mohammadtaghi Rahimi
  - Department of Civil Engineering, Kish International Branch, Islamic Azad University, Kish Island, Iran
- Hossein Ebrahimi
  - Department of Water Science and Engineering, Shahr-e-Qods Branch, Islamic Azad University, Tehran, Iran
9
Aziz MT, Mahmud SMH, Elahe MF, Jahan H, Rahman MH, Nandi D, Smirani LK, Ahmed K, Bui FM, Moni MA. A Novel Hybrid Approach for Classifying Osteosarcoma Using Deep Feature Extraction and Multilayer Perceptron. Diagnostics (Basel) 2023; 13:2106. [PMID: 37371001 DOI: 10.3390/diagnostics13122106] [Received: 05/03/2023] [Revised: 06/10/2023] [Accepted: 06/13/2023] [Indexed: 06/29/2023] Open
Abstract
Osteosarcoma is the most common type of bone cancer that tends to occur in teenagers and young adults. Due to crowded context, inter-class similarity, inter-class variation, and noise in H&E-stained (hematoxylin and eosin stain) histology tissue, pathologists frequently face difficulty in osteosarcoma tumor classification. In this paper, we introduced a hybrid framework for improving the efficiency of three types of osteosarcoma tumor (nontumor, necrosis, and viable tumor) classification by merging different types of CNN-based architectures with a multilayer perceptron (MLP) algorithm on the WSI (whole slide images) dataset. We performed various kinds of preprocessing on the WSI images. Then, five pre-trained CNN models were trained with multiple parameter settings to extract insightful features via transfer learning, where convolution combined with pooling was utilized as a feature extractor. For feature selection, a decision tree-based RFE was designed to recursively eliminate less significant features to improve the model generalization performance for accurate prediction. Here, a decision tree was used as an estimator to select the different features. Finally, a modified MLP classifier was employed to classify binary and multiclass types of osteosarcoma under the five-fold CV to assess the robustness of our proposed hybrid model. Moreover, the feature selection criteria were analyzed to select the optimal one based on their execution time and accuracy. The proposed model achieved an accuracy of 95.2% for multiclass classification and 99.4% for binary classification. Experimental findings indicate that our proposed model significantly outperforms existing methods; therefore, this model could be applicable to support doctors in osteosarcoma diagnosis in clinics. In addition, our proposed model is integrated into a web application using the FastAPI web framework to provide a real-time prediction.
Affiliation(s)
- Md Tarek Aziz
  - Centre for Advanced Machine Learning and Applications (CAMLAs), Bashundhara R/A, Dhaka 1229, Bangladesh
- S M Hasan Mahmud
  - Centre for Advanced Machine Learning and Applications (CAMLAs), Bashundhara R/A, Dhaka 1229, Bangladesh
  - Department of Computer Science, American International University-Bangladesh (AIUB), 408/1, Kuratoli, Khilkhet, Dhaka 1229, Bangladesh
- Md Fazla Elahe
  - Centre for Advanced Machine Learning and Applications (CAMLAs), Bashundhara R/A, Dhaka 1229, Bangladesh
  - Department of Software Engineering, Daffodil International University, Daffodil Smart City (DSC), Savar, Dhaka 1216, Bangladesh
- Hosney Jahan
  - Centre for Advanced Machine Learning and Applications (CAMLAs), Bashundhara R/A, Dhaka 1229, Bangladesh
  - Department of Computer Science & Engineering (CSE), Military Institute of Science and Technology (MIST), Mirpur Cantonment, Dhaka 1216, Bangladesh
- Md Habibur Rahman
  - Centre for Advanced Machine Learning and Applications (CAMLAs), Bashundhara R/A, Dhaka 1229, Bangladesh
  - Department of Computer Science and Engineering, Islamic University, Kushtia 7003, Bangladesh
- Dip Nandi
  - Department of Computer Science, American International University-Bangladesh (AIUB), 408/1, Kuratoli, Khilkhet, Dhaka 1229, Bangladesh
- Lassaad K Smirani
  - The Deanship of Information Technology and E-learning, Umm Al-Qura University, Mecca 24382, Saudi Arabia
- Kawsar Ahmed
  - Department of Electrical and Computer Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK S7N 5A9, Canada
  - Group of Biophotomatiχ, Department of Information and Communication Technology (ICT), Mawlana Bhashani Science and Technology University (MBSTU), Tangail 1902, Bangladesh
- Francis M Bui
  - Department of Electrical and Computer Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK S7N 5A9, Canada
- Mohammad Ali Moni
  - Artificial Intelligence & Digital Health, School of Health and Rehabilitation Sciences, Faculty of Health and Behavioural Sciences, The University of Queensland, St. Lucia, QLD 4072, Australia