1
Huang W, Tian H, Wang S, Zhang C, Zhang X. Integration of simulated annealing into pigeon inspired optimizer algorithm for feature selection in network intrusion detection systems. PeerJ Comput Sci 2024; 10:e2176. [PMID: 39145221 PMCID: PMC11322994 DOI: 10.7717/peerj-cs.2176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Accepted: 06/12/2024] [Indexed: 08/16/2024]
Abstract
In the context of the 5G network, the proliferation of access devices results in heightened network traffic and shifts in traffic patterns, so network intrusion detection faces greater challenges. A feature selection algorithm is proposed for network intrusion detection systems that uses an improved binary pigeon-inspired optimizer (SABPIO) algorithm to tackle the challenges posed by the high dimensionality and complexity of network traffic, which lead to complex models, reduced accuracy, and longer detection times. First, the raw dataset is pre-processed by one-hot encoding and standardization. Next, feature selection is performed using SABPIO, which employs simulated annealing and a population decay factor to identify the most relevant subset of features for subsequent review and evaluation. Finally, the selected subset of features is fed into decision tree and random forest classifiers to evaluate the effectiveness of SABPIO. The proposed algorithm has been validated through experiments on three publicly available datasets: UNSW-NB15, NSL-KDD, and CIC-IDS-2017. The experimental findings demonstrate that SABPIO identifies the most indicative subset of features through rational computation. This method significantly shortens the system's training duration and enhances detection rates; compared with using all features, it reduces the training and testing times by factors of at least 3.2 and 0.3, respectively. Furthermore, compared with the feature subsets selected by the CPIO and XGBoost algorithms, it improves the F1-score by 1.21% to 2.19% and by 1.79% to 4.52%, respectively.
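The abstract above names simulated annealing as the ingredient added to the binary pigeon-inspired optimizer. As a rough illustration only, and not the authors' SABPIO, the following sketch applies the standard Metropolis acceptance rule to a binary feature mask. The function names (`accept`, `flip_one`, `anneal`), the geometric cooling schedule, and the toy cost function are all assumptions made for this example.

```python
import math
import random


def accept(delta, temperature, u):
    """Metropolis rule: always accept improvements (delta <= 0);
    accept a worse candidate with probability exp(-delta / temperature).
    `u` is a uniform random draw in [0, 1)."""
    if delta <= 0:
        return True
    return u < math.exp(-delta / temperature)


def flip_one(mask, index):
    """Return a copy of the binary feature mask with one bit flipped."""
    new_mask = list(mask)
    new_mask[index] = 1 - new_mask[index]
    return new_mask


def anneal(cost, mask, temperature=1.0, cooling=0.9, steps=200, seed=0):
    """Minimize `cost(mask)` over binary masks with simulated annealing."""
    rng = random.Random(seed)
    best = current = list(mask)
    for _ in range(steps):
        candidate = flip_one(current, rng.randrange(len(current)))
        delta = cost(candidate) - cost(current)
        if accept(delta, temperature, rng.random()):
            current = candidate
            if cost(current) < cost(best):
                best = current
        temperature *= cooling  # geometric cooling (an assumption)
    return best


# Hypothetical toy cost: features 0 and 2 are "useful", and every selected
# feature adds a small penalty, so the ideal mask is [1, 0, 1, 0].
def toy_cost(mask):
    missed = (1 - mask[0]) + (1 - mask[2])
    return missed + 0.1 * sum(mask)


print(anneal(toy_cost, [0, 0, 0, 0]))
```

In a real feature-selection setting the cost would typically come from a classifier's validation error on the selected columns; the toy cost here only keeps the example self-contained.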
Affiliation(s)
- Wanwei Huang
  - College of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou, Henan, China
- Haobin Tian
  - College of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou, Henan, China
- Sunan Wang
  - Electronic & Communication Engineering, Shenzhen Polytechnic School, Shenzhen, Guangdong, China
- Chaoqin Zhang
  - College of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou, Henan, China
- Xiaohui Zhang
  - Henan Xinda Wangyu Technology Co. Ltd, Zhengzhou, Henan, China
2
Musthafa MB, Huda S, Kodera Y, Ali MA, Araki S, Mwaura J, Nogami Y. Optimizing IoT Intrusion Detection Using Balanced Class Distribution, Feature Selection, and Ensemble Machine Learning Techniques. SENSORS (BASEL, SWITZERLAND) 2024; 24:4293. [PMID: 39001072 PMCID: PMC11244377 DOI: 10.3390/s24134293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2024] [Revised: 06/26/2024] [Accepted: 06/27/2024] [Indexed: 07/16/2024]
Abstract
Internet of Things (IoT) devices are leading to advancements in innovation, efficiency, and sustainability across various industries. However, as the number of connected IoT devices increases, the risk of intrusion becomes a major concern in IoT security. To prevent intrusions, it is crucial to implement intrusion detection systems (IDSs) that can detect and prevent such attacks. IDSs are a critical component of cybersecurity infrastructure. They are designed to detect and respond to malicious activities within a network or system. Traditional IDS methods rely on predefined signatures or rules to identify known threats, but these techniques may struggle to detect novel or sophisticated attacks. The implementation of IDSs with machine learning (ML) and deep learning (DL) techniques has been proposed to improve IDSs' ability to detect attacks. This will enhance overall cybersecurity posture and resilience. However, ML and DL techniques face several issues that may impact the models' performance and effectiveness, such as overfitting and the effects of unimportant features on finding meaningful patterns. To ensure better performance and reliability of machine learning models in IDSs when dealing with new and unseen threats, the models need to be optimized. This can be done by addressing overfitting and implementing feature selection. In this paper, we propose a scheme to optimize IoT intrusion detection by using class balancing and feature selection for preprocessing. We evaluated the scheme on the UNSW-NB15 dataset and the NSL-KDD dataset by implementing two different ensemble models: one using a support vector machine (SVM) with bagging and another using long short-term memory (LSTM) with stacking. The performance results and the confusion matrix show that the LSTM stacking model with analysis of variance (ANOVA) feature selection is a superior model for classifying network attacks. It has remarkable accuracies of 96.92% and 99.77% and overfitting values of 0.33% and 0.04% on the two datasets, respectively. The model's ROC curve also has a sharp bend, with AUC values of 0.9665 and 0.9971 for the UNSW-NB15 dataset and the NSL-KDD dataset, respectively.
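To make the ANOVA feature selection mentioned in the abstract above more concrete, here is a minimal sketch of the one-way ANOVA F-statistic computed per feature and used to rank features. It is an illustration under assumptions, not the authors' pipeline: the helper names (`split_by_class`, `anova_f`), the feature names, and the toy values are invented for this example.

```python
def split_by_class(values, labels):
    """Group one feature's values by class label."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return list(groups.values())


def anova_f(groups):
    """One-way ANOVA F-statistic: between-class variance divided by
    within-class variance. Assumes at least two classes and some
    within-class spread (otherwise the denominator is zero)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical toy data: "duration" separates the two classes, "noise" does not.
labels = [0, 0, 0, 1, 1, 1]
features = {
    "duration": [1, 2, 3, 11, 12, 13],
    "noise": [1, 2, 3, 2, 3, 4],
}
scores = {name: anova_f(split_by_class(col, labels)) for name, col in features.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores, ranked)
```

A selector would then keep the top-ranked features; how many to keep is a tuning choice that the abstract does not specify.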
Affiliation(s)
- Muhammad Bisri Musthafa
  - Graduate School of Environmental, Life, Natural Science and Technology, Okayama University, Okayama 700-8530, Japan
- Samsul Huda
  - Green Innovation Center, Okayama University, Okayama 700-8530, Japan
- Yuta Kodera
  - Graduate School of Environmental, Life, Natural Science and Technology, Okayama University, Okayama 700-8530, Japan
- Md Arshad Ali
  - Faculty of CSE, Hajee Mohammad Danesh Science and Technology University, Dinajpur 5200, Bangladesh
- Shunsuke Araki
  - Graduate School of Computer Science and Systems Engineering, Kyushu Institute of Technology, Fukuoka 804-8550, Japan
- Jedidah Mwaura
  - Graduate School of Computer Science and Systems Engineering, Kyushu Institute of Technology, Fukuoka 804-8550, Japan
- Yasuyuki Nogami
  - Graduate School of Environmental, Life, Natural Science and Technology, Okayama University, Okayama 700-8530, Japan
3
Salman EH, Taher MA, Hammadi YI, Mahmood OA, Muthanna A, Koucheryavy A. An Anomaly Intrusion Detection for High-Density Internet of Things Wireless Communication Network Based Deep Learning Algorithms. SENSORS (BASEL, SWITZERLAND) 2022; 23:s23010206. [PMID: 36616806 DOI: 10.3390/electronics11203332] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/30/2022] [Revised: 12/10/2022] [Accepted: 12/22/2022] [Indexed: 05/27/2023]
Abstract
Telecommunication networks are growing exponentially due to their significant role in civilization and industry. As a result of this role, diverse applications have appeared that require secured links for data transmission. Internet-of-Things (IoT) devices are a substantial field that utilizes the wireless communication infrastructure. However, IoT devices, besides the diversity of their communications, are more vulnerable to attacks due to their physical distribution in the real world. Attackers may prevent services from running or even forward all of the critical data across the network. Therefore, an Intrusion Detection System (IDS) has to be integrated into communication networks. In the literature, there are numerous methodologies for implementing IDSs. In this paper, two distinct models are proposed. In the first model, a custom Convolutional Neural Network (CNN) was constructed and combined with Long Short-Term Memory (LSTM) deep network layers. The second model was built entirely from fully connected (dense) layers of various dimensions to form a custom Artificial Neural Network (ANN). Results were outstanding compared to the Logistic Regression (LR) algorithm: the second model obtained an accuracy of 97.01% and the first model 96.08%, whereas the LR algorithm showed an accuracy of 92.8%.
Affiliation(s)
- Emad Hmood Salman
  - Department of Communications Engineering, College of Engineering, University of Diyala, Baquba 32001, Iraq
- Montadar Abas Taher
  - Department of Communications Engineering, College of Engineering, University of Diyala, Baquba 32001, Iraq
- Yousif I Hammadi
  - Department of Medical Instruments Engineering Techniques, Bilad Alrafidain University College, Diyala 32001, Iraq
- Omar Abdulkareem Mahmood
  - Department of Communications Engineering, College of Engineering, University of Diyala, Baquba 32001, Iraq
- Ammar Muthanna
  - Department of Telecommunication Networks and Data Transmission, The Bonch-Bruevich Saint-Petersburg State University of Telecommunications, 193232 Saint Petersburg, Russia
- Andrey Koucheryavy
  - Department of Telecommunication Networks and Data Transmission, The Bonch-Bruevich Saint-Petersburg State University of Telecommunications, 193232 Saint Petersburg, Russia
4
Toward Efficient Intrusion Detection System Using Hybrid Deep Learning Approach. Symmetry (Basel) 2022. [DOI: 10.3390/sym14091916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
The increased adoption of cloud computing resources produces major loopholes in cloud computing for cybersecurity attacks. An intrusion detection system (IDS) is one of the vital defenses against threats and attacks to cloud computing. Current IDSs encounter two challenges, namely, low accuracy and a high false alarm rate. Due to these challenges, additional efforts are required by network experts to respond to abnormal traffic alerts. To improve IDS efficiency in detecting abnormal network traffic, this work develops an IDS using a recurrent neural network based on gated recurrent units (GRUs) and improved long short-term memory (LSTM) through a computing unit to form Cu-LSTMGRU. The proposed system efficiently classifies the network flow instances as benign or malevolent. This system is examined using the most up-to-date dataset CICIDS2018. To further optimize computational complexity, the dataset is optimized through the Pearson correlation feature selection algorithm. The proposed model is evaluated using several metrics. The results show that the proposed model remarkably outperforms benchmarks by up to 12.045%. Therefore, the Cu-LSTMGRU model provides a high level of symmetry between cloud computing security and the detection of intrusions and malicious attacks.
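The abstract above mentions Pearson correlation feature selection without giving details. One common variant, shown here purely as an assumption about what such a step can look like, drops any feature whose absolute Pearson correlation with an already kept feature exceeds a threshold. The function names, the 0.9 threshold, and the toy columns are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length columns.
    Assumes neither column is constant (otherwise the denominator is zero)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def drop_correlated(features, threshold=0.9):
    """Keep a feature only if it is not strongly correlated with any
    feature kept so far (greedy, in insertion order)."""
    kept = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical flow features: "packets" is a scaled copy of "bytes".
flows = {
    "bytes": [1, 2, 3, 4],
    "packets": [2, 4, 6, 8],
    "flags": [1, 0, 1, 0],
}
print(drop_correlated(flows))
```

Removing redundant columns this way shrinks the input to the classifier, which is the kind of computational saving the abstract refers to.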
5
Zeng H, Jiang S, Cui T, Lu Z, Li J, Lee BG, Zhu J, Yang X. ScatterHough: Automatic Lane Detection from Noisy LiDAR Data. SENSORS 2022; 22:s22145424. [PMID: 35891101 PMCID: PMC9319445 DOI: 10.3390/s22145424] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Revised: 07/05/2022] [Accepted: 07/18/2022] [Indexed: 12/02/2022]
Abstract
Lane detection plays an essential role in autonomous driving. Using LiDAR data instead of RGB images turns lane detection into a simple straight-line and curve fitting problem that works for real-time applications even under poor weather or lighting conditions. Handling scattered, noisy data is a crucial step in reducing lane detection error from LiDAR data. The classic Hough Transform (HT) only allows points on a straight line to vote on the corresponding parameters, which is not suitable for data in scatter form. In this paper, a Scatter Hough algorithm is proposed for better lane detection on scatter data. Two additional operations, ρ neighbor voting and ρ neighbor vote-reduction, are introduced to HT so that points on the same curve vote and also take their neighbors’ voting results into account. The evaluation shows that the proposed method can adaptively fit both straight lines and curves with high accuracy compared with benchmark and state-of-the-art methods.
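The contrast drawn in the abstract above, between classic Hough voting and voting that also credits neighboring ρ bins, can be sketched as follows. This is a loose reading of the abstract for straight lines only and is not the ScatterHough algorithm: the vote-reduction step and curve handling are omitted, and the function names, the `rho_neighbors` parameter, and the sample points are assumptions.

```python
import math
from collections import Counter


def hough_votes(points, n_theta=180, rho_step=1.0, rho_neighbors=0):
    """Accumulate Hough votes over (theta index, rho bin).
    With rho_neighbors=0 this is the classic transform; with a positive
    value each point also votes for that many adjacent rho bins."""
    votes = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho_bin = round((x * math.cos(theta) + y * math.sin(theta)) / rho_step)
            for offset in range(-rho_neighbors, rho_neighbors + 1):
                votes[(t, rho_bin + offset)] += 1
    return votes


def best_line(points, **kwargs):
    """Return (theta index, rho bin, vote count) of the strongest bin."""
    (t, rho_bin), count = hough_votes(points, **kwargs).most_common(1)[0]
    return t, rho_bin, count


# Points exactly on the horizontal line y = 5 (theta = 90 degrees, rho = 5).
clean = [(0, 5), (10, 5), (20, 5), (30, 5)]
# The same line with +/-1 scatter in y (hypothetical noisy returns).
scatter = [(0, 5), (10, 6), (20, 4), (30, 5)]

print(best_line(clean))
print(hough_votes(scatter)[(90, 5)])
print(hough_votes(scatter, rho_neighbors=1)[(90, 5)])
```

On the scattered points, the classic accumulator splits its votes across adjacent ρ bins, while the neighbor-voting variant lets all four points support the same bin. Wider neighborhoods also blur the accumulator, which is presumably why the paper pairs neighbor voting with a vote-reduction step.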
Affiliation(s)
- Honghao Zeng
  - School of Computer Science, University of Nottingham Ningbo China, Ningbo 315100, China
- Shihong Jiang
  - Huawei Technologies Co., Ltd., Shanghai 201206, China
- Tianxiang Cui (corresponding author)
  - School of Computer Science, University of Nottingham Ningbo China, Ningbo 315100, China
- Zheng Lu
  - School of Computer Science, University of Nottingham Ningbo China, Ningbo 315100, China
- Jiawei Li
  - School of Computer Science, University of Nottingham Ningbo China, Ningbo 315100, China
- Boon-Giin Lee
  - School of Computer Science, University of Nottingham Ningbo China, Ningbo 315100, China
- Junsong Zhu
  - School of Computer Science, University of Nottingham Ningbo China, Ningbo 315100, China
- Xiaoying Yang
  - School of Computer Science, University of Nottingham Ningbo China, Ningbo 315100, China
6
Advanced Feature-Selection-Based Hybrid Ensemble Learning Algorithms for Network Intrusion Detection Systems. Symmetry (Basel) 2022. [DOI: 10.3390/sym14071461] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/06/2023]
Abstract
As cyber-attacks become remarkably sophisticated, effective Intrusion Detection Systems (IDSs) are needed to monitor computer resources and to provide alerts regarding unusual or suspicious behavior. Despite using several machine learning (ML) and data mining methods to achieve high effectiveness, these systems have not proven ideal. Current intrusion detection algorithms suffer from high dimensionality, redundancy, meaningless data, high error rate, false alarm rate, and false-negative rate. This paper proposes a novel Ensemble Learning (EL) algorithm-based network IDS model. Efficient feature selection is attained via a hybrid of Correlation Feature Selection coupled with Forest Panelized Attributes (CFS–FPA). The improved intrusion detection involves exploiting AdaBoosting and bagging ensemble learning algorithms to modify four classifiers: Support Vector Machine, Random Forest, Naïve Bayes, and K-Nearest Neighbor. These four enhanced classifiers have been applied first as AdaBoosting and then as bagging, with their outputs aggregated through the voting average technique. To provide better benchmarking, both binary and multi-class classification forms are used to evaluate the model. Applying the model to the CICIDS2017 dataset achieved promising results: 99.7% accuracy, a 0.053 false-negative rate, and a 0.004 false alarm rate. This system will be effective for information technology-based organizations, as it is expected to provide a high level of symmetry between information security and detection of attacks and malicious intrusion.
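For readers less familiar with the metrics quoted in the abstract above, the sketch below computes accuracy, false-negative rate, and false alarm rate (false-positive rate) from binary confusion-matrix counts using their standard definitions. The function name `detection_rates` and the counts are made up for illustration and are not the paper's results.

```python
def detection_rates(tp, fp, tn, fn):
    """Standard binary detection metrics from confusion-matrix counts,
    with attacks as the positive class."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        # Share of real attacks that were missed.
        "false_negative_rate": fn / (fn + tp),
        # Share of benign traffic that raised an alert.
        "false_alarm_rate": fp / (fp + tn),
    }


# Hypothetical counts: 100 attacks (90 caught) and 100 benign flows (5 flagged).
example = detection_rates(tp=90, fp=5, tn=95, fn=10)
print(example)
```

Reporting the false-negative and false alarm rates next to accuracy matters on imbalanced traffic, where accuracy alone can look high even when many attacks are missed.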
7
Jaw E, Wang X. A novel hybrid-based approach of snort automatic rule generator and security event correlation (SARG-SEC). PeerJ Comput Sci 2022; 8:e900. [PMID: 35494802 PMCID: PMC9044335 DOI: 10.7717/peerj-cs.900] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2021] [Accepted: 02/01/2022] [Indexed: 06/14/2023]
Abstract
Rapid technological development, alongside the Internet and its cutting-edge applications, has positively impacted human society in many respects. Nevertheless, it also brings escalating privacy and critical cybersecurity concerns that can lead to catastrophic consequences, such as overwhelming current network security frameworks. Consequently, both industry and academia have been tirelessly harnessing various approaches to design, implement and deploy intrusion detection systems (IDSs) with event correlation frameworks to help mitigate some of these contemporary challenges. There are two common types of IDS: signature-based and anomaly-based. Signature-based IDSs, specifically Snort, work on the concept of rules. However, the conventional way of creating Snort rules can be very costly and error-prone. Also, the massive number of alerts generated by heterogeneous anomaly-based IDSs is a significant research challenge yet to be addressed. Therefore, this paper proposes a novel Snort Automatic Rule Generator (SARG) that exploits network packet contents to automatically generate efficient and reliable Snort rules with less human intervention. Furthermore, we evaluated the effectiveness and reliability of the generated Snort rules, which produced promising results. In addition, this paper proposes a novel Security Event Correlator (SEC) that accepts raw events (alerts) without prior knowledge and produces a much more manageable set of alerts for easy analysis and interpretation, thereby alleviating the massive false alarm rate (FAR) challenges of existing IDSs. Lastly, we performed a series of experiments to test the proposed systems. The experimental results show that SARG-SEC demonstrates impressive performance and could significantly mitigate the existing challenges of dealing with the vast number of generated alerts and the labor-intensive creation of Snort rules.
Affiliation(s)
- Ebrima Jaw
  - College of Computer Science and Technology, Guizhou University, Guiyang, Guizhou, China
  - School of Information Technology and Communication, University of The Gambia (UTG), Banjul, Peace Building, Kanifing, The Gambia
- Xueming Wang
  - College of Computer Science and Technology, Guizhou University, Guiyang, Guizhou, China
8
TCAN-IDS: Intrusion Detection System for Internet of Vehicle Using Temporal Convolutional Attention Network. Symmetry (Basel) 2022. [DOI: 10.3390/sym14020310] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/06/2023]
Abstract
Intrusion detection systems based on recurrent neural networks (RNNs) have been considered one of the effective methods for detecting time-series data of in-vehicle networks. However, building a model for each arbitration bit is not only structurally complex but also computationally expensive. Convolutional neural networks (CNNs) have always performed excellently in processing images, and they have recently shown great performance in learning features of normal and attack traffic by constructing message matrices in a way that achieves real-time monitoring. However, they suffer from the problem of temporal relationships in context and inadequate feature representation in key regions. Therefore, this paper proposes a temporal convolutional network with global attention to construct an in-vehicle network intrusion detection model, called TCAN-IDS. Specifically, the TCAN-IDS model continuously encodes 19-bit features consisting of an arbitration bit and data field of the original message into a message matrix, which is symmetric to messages recalling a historical moment. Thereafter, the feature extraction model extracts its spatial-temporal detail features. Notably, global attention enables global critical region attention based on channel and spatial feature coefficients, thus ignoring unimportant byte changes. Finally, anomalous traffic is monitored by a two-class classification component. Experiments show that TCAN-IDS demonstrates high detection performance on publicly known attack datasets and is able to accomplish real-time monitoring. In particular, it is anticipated to provide a high level of symmetry between information security and illegal intrusion.
9
Detection of Username Enumeration Attack on SSH Protocol: Machine Learning Approach. Symmetry (Basel) 2021. [DOI: 10.3390/sym13112192] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
Abstract
Over the last two decades (2000–2020), the Internet has rapidly evolved, resulting in symmetrical and asymmetrical Internet consumption patterns and billions of users worldwide. With the immense rise of the Internet, attacks and malicious behaviors pose a huge threat to our computing environment. The brute-force attack is among the most prominent and commonly used attacks, carried out using password-attack tools, a wordlist dictionary, and a list of usernames, the latter obtained through a so-called enumeration attack. In this paper, we investigate username enumeration attack detection on the SSH protocol by using machine-learning classifiers. We apply four asymmetrical classifiers to our generated dataset, collected from a closed-environment network, to build machine-learning-based models for attack detection. The use of several machine learners offers a wider investigation spectrum of the classifiers’ ability in attack detection. Additionally, we investigate how beneficial it is to include or exclude network port information as a feature set in the learning process. We evaluated and compared the performance of the machine-learning models for both cases. The models used are k-nearest neighbor (K-NN), naïve Bayes (NB), random forest (RF) and decision tree (DT), with and without port information. Our results show that machine-learning approaches to detecting SSH username enumeration attacks were quite successful, with K-NN having an accuracy of 99.93%, NB 95.70%, RF 99.92%, and DT 99.88%. Furthermore, the results improve when port information is used.