1
Wang Z, Ghaleb FA, Zainal A, Siraj MM, Lu X. An efficient intrusion detection model based on convolutional spiking neural network. Sci Rep 2024; 14:7054. [PMID: 38528084 DOI: 10.1038/s41598-024-57691-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2024] [Accepted: 03/20/2024] [Indexed: 03/27/2024]
Abstract
Many intrusion detection techniques have been developed to ensure that the target system can function properly under the established rules. With the booming Internet of Things (IoT) applications, the resource-constrained nature of its devices makes it urgent to explore lightweight and high-performance intrusion detection models. Recent years have seen a particularly active application of deep learning (DL) techniques. The spiking neural network (SNN), a type of artificial intelligence that is associated with sparse computations and inherent temporal dynamics, has been viewed as a potential candidate for the next generation of DL. It should be noted, however, that current research into SNNs has largely focused on scenarios where limited computational resources and insufficient power sources are not considered. Consequently, even state-of-the-art SNN solutions tend to be inefficient. In this paper, a lightweight and effective detection model is proposed. With the help of rational algorithm design, the model integrates the advantages of SNNs as well as convolutional neural networks (CNNs). In addition to reducing resource usage, it maintains a high level of classification accuracy. The proposed model was evaluated against some current state-of-the-art models using a comprehensive set of metrics. Based on the experimental results, the model demonstrated improved adaptability to environments with limited computational resources and energy sources.
Affiliation(s)
- Zhen Wang
- Faculty of Computing, Universiti Teknologi Malaysia, Johor Bahru, 81310, Johor, Malaysia
- School of Data Science and Artificial Intelligence, Wenzhou University of Technology, Wenzhou, 325035, Zhejiang, China
- Fuad A Ghaleb
- College of Computing and Digital Technology, Birmingham City University, Birmingham, B47XG, United Kingdom
- Anazida Zainal
- Faculty of Computing, Universiti Teknologi Malaysia, Johor Bahru, 81310, Johor, Malaysia
- Maheyzah Md Siraj
- Faculty of Computing, Universiti Teknologi Malaysia, Johor Bahru, 81310, Johor, Malaysia
- Xing Lu
- School of Data Science and Artificial Intelligence, Wenzhou University of Technology, Wenzhou, 325035, Zhejiang, China
2
Ashkouti F, Khamforoosh K. A distributed computing model for big data anonymization in the networks. PLoS One 2023; 18:e0285212. [PMID: 37115783 PMCID: PMC10146481 DOI: 10.1371/journal.pone.0285212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Accepted: 04/18/2023] [Indexed: 04/29/2023]
Abstract
Big data and its applications have recently grown sharply in fields such as IoT, bioinformatics, eCommerce, and social media. The huge volume of data poses enormous challenges to the architecture, infrastructure, and computing capacity of IT systems. The scientific and industrial communities therefore have a compelling need for large-scale, robust computing systems. Since one of the characteristics of big data is value, data should be published so that analysts can extract useful patterns from them. However, data publishing may lead to the disclosure of individuals' private information. Among the modern parallel computing platforms, Apache Spark is a fast, in-memory computing framework for large-scale data processing that provides high scalability by introducing resilient distributed datasets (RDDs). In terms of performance, owing to in-memory computation, it can be up to 100 times faster than Hadoop. Therefore, Apache Spark is one of the essential frameworks for implementing distributed methods for privacy-preserving big data publishing (PPBDP). This paper uses the RDD programming model of Apache Spark to propose an efficient parallel implementation of a new computing model for big data anonymization. This computing model has three phases of in-memory computation to address the runtime, scalability, and performance of large-scale data anonymization. The model supports partition-based data clustering algorithms that preserve the λ-diversity privacy model by using transformations and actions on RDDs. The authors have therefore investigated a Spark-based implementation for preserving the λ-diversity privacy model with two purpose-designed distance functions, City block and Pearson. The results of the paper provide a comprehensive guideline allowing researchers to apply Apache Spark in their own research.
Affiliation(s)
- Farough Ashkouti
- Department of Computer Engineering, Mahabad Branch, Islamic Azad University, Mahabad, Iran
- Keyhan Khamforoosh
- Department of Computer Engineering, Sanandaj Branch, Islamic Azad University, Sanandaj, Iran
3
Bagui S, Mink D, Bagui S, Ghosh T, McElroy T, Paredes E, Khasnavis N, Plenkers R. Detecting Reconnaissance and Discovery Tactics from the MITRE ATT&CK Framework in Zeek Conn Logs Using Spark's Machine Learning in the Big Data Framework. Sensors (Basel, Switzerland) 2022; 22:7999. [PMID: 36298351 PMCID: PMC9610873 DOI: 10.3390/s22207999] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2022] [Revised: 10/17/2022] [Accepted: 10/18/2022] [Indexed: 06/16/2023]
Abstract
As computer networks and the massive amount of communication taking place on them grow, the damage that network intrusions can do grows in tandem. An effective and scalable intrusion detection system (IDS) is needed to address the potential damage that comes with this growth. A great deal of contemporary research on near real-time IDS focuses on applying machine learning classifiers to labeled network intrusion datasets, but these datasets need to reflect current network intrusions. This paper focuses on a newly created dataset, UWF-ZeekData22, which analyzes data from Zeek's Connection Logs collected using the Security Onion 2 network security monitor and labelled using the MITRE ATT&CK framework TTPs. Due to the volume of data, Spark, in the big data framework, was used to run many of the well-known classifiers (naïve Bayes, random forest, decision tree, support vector classifier, gradient boosted trees, and logistic regression) to classify the reconnaissance and discovery tactics from this dataset. In addition to the performance of these classifiers using Spark, scalability and response time were also analyzed.
Affiliation(s)
- Sikha Bagui
- Department of Computer Science, University of West Florida, Pensacola, FL 32514, USA
- Dustin Mink
- Department of Computer Science, University of West Florida, Pensacola, FL 32514, USA
- Subhash Bagui
- Department of Mathematics and Statistics, University of West Florida, Pensacola, FL 32514, USA
- Tirthankar Ghosh
- Department of Computer Science, University of West Florida, Pensacola, FL 32514, USA
- Tom McElroy
- Department of Computer Science, University of West Florida, Pensacola, FL 32514, USA
- Esteban Paredes
- Department of Computer Science, University of West Florida, Pensacola, FL 32514, USA
- Nithisha Khasnavis
- Department of Computer Science, University of West Florida, Pensacola, FL 32514, USA
- Russell Plenkers
- Department of Computer Science, University of West Florida, Pensacola, FL 32514, USA
4
Manzano Sanchez RA, Zaman M, Goel N, Naik K, Joshi R. Towards Developing a Robust Intrusion Detection Model Using Hadoop-Spark and Data Augmentation for IoT Networks. Sensors (Basel, Switzerland) 2022; 22:7726. [PMID: 36298077 PMCID: PMC9608938 DOI: 10.3390/s22207726] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2022] [Revised: 10/04/2022] [Accepted: 10/06/2022] [Indexed: 06/16/2023]
Abstract
In recent years, anomaly detection and machine learning for intrusion detection systems have been used to detect anomalies on Internet of Things networks. These systems rely on machine and deep learning to improve the detection accuracy. However, the robustness of the model depends on the number of data samples available, the quality of the data, and the distribution of the data classes. In the present paper, we focused specifically on the amount of data and on class imbalance, since both are key in IoT because network traffic is increasing exponentially. For this reason, we propose a framework that uses a big data methodology with Hadoop-Spark to train and test multi-class and binary classification with a one-vs-rest strategy for intrusion detection using the entire BoT IoT dataset. Thus, we evaluate all the algorithms available in Hadoop-Spark in terms of accuracy and processing time. In addition, since the BoT IoT dataset used is highly imbalanced, we also improve the accuracy for detecting minority classes by generating more data samples using a Conditional Tabular Generative Adversarial Network (CTGAN). In general, our proposed model outperforms other published models, including our previous model. Using our proposed methodology, the F1-score of one of the minority classes, the Theft attack, was improved from 42% to 99%.
Affiliation(s)
- Marzia Zaman
- Cistel Technology Inc., 30 Concourse Gate, Nepean, ON K2E 7V7, Canada
- Nishith Goel
- Cistech Limited, 201-203 Colonnade Rd, Nepean, ON K2E 7K3, Canada
- Kshirasagar Naik
- Department of Electrical and Computer Engineering, University of Waterloo, 200 University Ave W, Waterloo, ON N2L 3G1, Canada
- Rohit Joshi
- Cistel Technology Inc., 30 Concourse Gate, Nepean, ON K2E 7V7, Canada
5
An intellectual intrusion detection system using Hybrid Hunger Games Search and Remora Optimization Algorithm for IoT wireless networks. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
6
Zhang C, Jia D, Wang L, Wang W, Liu F, Yang A. Comparative Research on Network Intrusion Detection Methods Based on Machine Learning. Comput Secur 2022. [DOI: 10.1016/j.cose.2022.102861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
7
Fuzzy Local Information and Bhattacharya-Based C-Means Clustering and Optimized Deep Learning in Spark Framework for Intrusion Detection. Electronics 2022. [DOI: 10.3390/electronics11111675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2023]
Abstract
Strong network connections make the risk of malicious activities emerge faster while dealing with big data. An intrusion detection system (IDS) can be utilized for alerting suitable entities when hazardous actions are occurring. Most techniques used to classify intrusions are not designed to run on big data. This paper devised an optimization-driven deep learning technique for detecting intrusions using the Spark model. The input data is fed to the data partitioning phase, wherein the data is partitioned using the proposed fuzzy local information and Bhattacharya-based C-means (FLIBCM). The proposed FLIBCM was devised by combining Bhattacharya distance and fuzzy local information C-Means (FLICM). Feature selection was performed using class-wise information gain to select the most important features. Data augmentation was done with oversampling to make the data suitable for further processing. Intrusion detection was done using a deep Maxout network (DMN), which was trained using the proposed student psychology water cycle caviar (SPWCC) obtained by combining the water cycle algorithm (WCA), the conditional autoregressive value at risk by regression quantiles (CAViaR), and the student psychology-based optimization algorithm (SPBO). The proposed SPWCC-based DMN offered enhanced performance with the highest accuracy of 97.6%, sensitivity of 98%, and specificity of 97%.
8
Learning-Based Methods for Cyber Attacks Detection in IoT Systems: Methods, Analysis, and Future Prospects. Electronics 2022. [DOI: 10.3390/electronics11091502] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 12/04/2022]
Abstract
The Internet of Things (IoT) is a developing technology that provides the simplicity and benefits of exchanging data with other devices using the cloud or wireless networks. However, the changes and developments in the IoT environment are making IoT systems susceptible to cyber attacks, which could lead to malicious intrusions. These intrusions could in turn cause physical and economic damage. This article primarily focuses on the IoT system/framework, the IoT, learning-based methods, and the difficulties faced by IoT devices or systems after the occurrence of an attack. Learning-based methods are reviewed using different types of cyber attacks, such as denial-of-service (DoS), distributed denial-of-service (DDoS), probing, user-to-root (U2R), remote-to-local (R2L), botnet attack, spoofing, and man-in-the-middle (MITM) attacks. For learning-based methods, both machine and deep learning methods are presented and analyzed in relation to the detection of cyber attacks in IoT systems. A comprehensive list of publications to date in the literature is integrated to present a complete picture of various developments in this area. Finally, future research directions are also provided in the paper.
9
Abstract
Emerging applications of IoT (the Internet of Things), such as smart transportation, health, and energy, are envisioned to greatly enhance the societal infrastructure and quality of life of individuals. In such innovative IoT applications, cost-efficient real-time decision-making is critical to facilitate, for example, effective transportation management and healthcare. In this paper, we formally define real-time decision tasks in IoT, review cutting-edge approaches that aim to efficiently schedule real-time decision tasks to meet their timing and data freshness constraints, review state-of-the-art approaches for efficient sensor data analytics in IoT, and discuss future research directions.
10
Najafimehr M, Zarifzadeh S, Mostafavi S. A hybrid machine learning approach for detecting unprecedented DDoS attacks. The Journal of Supercomputing 2022; 78:8106-8136. [PMID: 35017789 PMCID: PMC8739683 DOI: 10.1007/s11227-021-04253-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/16/2021] [Indexed: 06/14/2023]
Abstract
Service availability plays a vital role in computer networks, against which Distributed Denial of Service (DDoS) attacks are a growing threat each year. Machine learning (ML) is a promising approach widely used for DDoS detection, and it obtains satisfactory results for previously known attacks. However, ML classifiers are almost incapable of detecting unknown malicious traffic. This paper proposes a novel method combining both supervised and unsupervised algorithms. First, a clustering algorithm separates the anomalous traffic from the normal data using several flow-based features. Then, using certain statistical measures, a classification algorithm is used to label the clusters. Employing a big data processing framework, we evaluate the proposed method by training on the CICIDS2017 dataset and testing on a different set of attacks provided in the more up-to-date CICDDoS2019. The results demonstrate that the Positive Likelihood Ratio (LR+) of our method is approximately 198% higher than that of the ML classification algorithms.
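The abstract above reports its result as a Positive Likelihood Ratio (LR+). For readers unfamiliar with the metric, LR+ is conventionally defined as sensitivity divided by (1 - specificity), that is, the true-positive rate over the false-positive rate. The following is a minimal sketch of that standard definition only; the function name and the example counts are illustrative assumptions and are not taken from the paper.

```python
import math


def positive_likelihood_ratio(tp: int, fp: int, tn: int, fn: int) -> float:
    """Return LR+ = sensitivity / (1 - specificity) from confusion-matrix counts.

    tp/fn are attack flows detected/missed; fp/tn are benign flows
    flagged/passed. All names here are hypothetical, for illustration.
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one positive and one negative sample")
    sensitivity = tp / (tp + fn)          # true-positive rate
    false_positive_rate = fp / (fp + tn)  # equals 1 - specificity
    if false_positive_rate == 0:
        return math.inf                   # no false alarms at all
    return sensitivity / false_positive_rate


# Hypothetical counts: 90 of 100 attacks caught, 5 of 100 benign flows flagged.
# Sensitivity is 0.9 and the false-positive rate is 0.05, so LR+ is about 18.
example = positive_likelihood_ratio(tp=90, fp=5, tn=95, fn=10)
```

A larger LR+ means a positive alert is more informative; a statement such as "198% higher" compares two such ratios in relative terms.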
11
A Novel Approach for Network Intrusion Detection Using Multistage Deep Learning Image Recognition. Electronics 2021. [DOI: 10.3390/electronics10151854] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Indexed: 11/17/2022]
Abstract
The current rise in hacking and computer network attacks throughout the world has heightened the demand for improved intrusion detection and prevention solutions. The intrusion detection system (IDS) is critical in identifying abnormalities and assaults on the network, which have grown in size and pervasiveness. The paper proposes a novel approach for network intrusion detection using multistage deep learning image recognition. The network features are transformed into four-channel (Red, Green, Blue, and Alpha) images. The images are then used for classification to train and test the pre-trained deep learning model ResNet50. The proposed approach is evaluated using two publicly available benchmark datasets, UNSW-NB15 and BOUN DDoS. On the UNSW-NB15 dataset, the proposed approach achieves 99.8% accuracy in the detection of the generic attack. On the BOUN DDoS dataset, the suggested approach achieves 99.7% accuracy in the detection of the DDoS attack and 99.7% accuracy in the detection of the normal traffic.
12
Abstract
In this era of big data, the amount of video content has dramatically increased with an exponential broadening of video streaming services. Hence, it has become very difficult for end-users to search for their desired videos. Therefore, to attain an accurate and robust clustering of information, a hybrid algorithm was used to introduce a recommender engine with collaborative filtering using Apache Spark and machine learning (ML) libraries. In this study, we implemented a movie recommendation system based on a collaborative filtering approach using the alternating least squares (ALS) model to predict the best-rated movies. Our proposed system uses the last search data of a user regarding movie category and references this to instruct the recommender engine, thereby making a list of predictions for top ratings. The proposed study used a model-based approach of matrix factorization, the ALS algorithm along with a collaborative filtering technique, which solved the cold start, sparsity, and scalability problems. In particular, we performed experimental analysis and obtained minimum root mean squared errors (RMSEs) of approximately 0.8959 to 0.97613. Moreover, our proposed movie recommendation system showed an accuracy of 97% and predicted the top 1000 ratings for movies.
13
Siboni S, Cohen A. Anomaly Detection for Individual Sequences with Applications in Identifying Malicious Tools. Entropy 2020; 22:e22060649. [PMID: 33286421 PMCID: PMC7517183 DOI: 10.3390/e22060649] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/25/2020] [Revised: 06/03/2020] [Accepted: 06/08/2020] [Indexed: 11/25/2022]
Abstract
Anomaly detection refers to the problem of identifying abnormal behaviour within a set of measurements. In many cases, one has some statistical model for normal data, and wishes to identify whether new data fit the model or not. However, in others, while there are normal data to learn from, there is no statistical model for this data, and there is no structured parameter set to estimate. Thus, one is forced to assume an individual sequences setup, where there is no given model or any guarantee that such a model exists. In this work, we propose a universal anomaly detection algorithm for one-dimensional time series that is able to learn the normal behaviour of systems and alert for abnormalities, without assuming anything on the normal data, or anything on the anomalies. The suggested method utilizes new information measures that were derived from the Lempel–Ziv (LZ) compression algorithm in order to optimally and efficiently learn the normal behaviour (during learning), and then estimate the likelihood of new data (during operation) and classify it accordingly. We apply the algorithm to key problems in computer security, as well as a benchmark anomaly detection data set, all using simple, single-feature time-indexed data. The first is detecting botnet Command and Control (C&C) channels without deep inspection. We then apply it to the problems of malicious tools detection via system calls monitoring and data leakage identification. We conclude with the New York City (NYC) taxi data. Finally, while using information theoretic tools, we show that an attacker’s attempt to maliciously fool the detection system by trying to generate normal data is bound to fail, either due to a high probability of error or because of the need for huge amounts of resources.
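The abstract above refers to information measures derived from Lempel–Ziv compression. As background only, and explicitly not the authors' algorithm, the sketch below shows the classic LZ78 incremental parsing on which such measures are commonly built: a sequence is split into the shortest phrases not seen before, and the phrase count serves as a rough complexity estimate (regular sequences need fewer phrases than irregular ones of the same length). The function name is a hypothetical choice for illustration.

```python
def lz78_phrase_count(sequence: str) -> int:
    """Count the phrases produced by LZ78 incremental parsing.

    Each phrase is the shortest prefix of the remaining input that has
    not appeared as an earlier phrase. A trailing incomplete phrase
    (one that is already in the dictionary) is counted as well.
    """
    seen = set()   # dictionary of phrases parsed so far
    phrase = ""    # phrase currently being extended
    count = 0
    for symbol in sequence:
        phrase += symbol
        if phrase not in seen:
            seen.add(phrase)
            count += 1
            phrase = ""
    if phrase:     # leftover symbols that matched an existing phrase
        count += 1
    return count


# A constant sequence parses into few phrases, a non-repeating one into many.
regular = lz78_phrase_count("aaaaaaaa")    # phrases: a, aa, aaa, (aa)
irregular = lz78_phrase_count("abcdefgh")  # every symbol is a new phrase
```

In this toy setting, a detector could compare the phrase count of new data against counts observed on normal data; how the paper actually learns and scores sequences is described in the original article, not here.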
Affiliation(s)
- Shachar Siboni
- Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva 8410501, Israel
- Correspondence: (S.S.); (A.C.); Tel.: +972-50-2560998 (S.S.); +972-50-2054477 (A.C.)
- Asaf Cohen
- School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva 8410501, Israel