1
Abu Al-Haija Q. Editorial: Artificial intelligence solutions for decision making in robotics. Front Robot AI 2024; 11:1389191. [PMID: 38533526 PMCID: PMC10964767 DOI: 10.3389/frobt.2024.1389191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Accepted: 02/27/2024] [Indexed: 03/28/2024] Open
Affiliation(s)
- Qasem Abu Al-Haija
  - Department of Cybersecurity, Faculty of Computer and Information Technology, Jordan University of Science and Technology, Irbid, Jordan
2
Ahmad A, Azzeh M, Alnagi E, Abu Al-Haija Q, Halabi D, Aref A, AbuHour Y. Hate speech detection in the Arabic language: corpus design, construction, and evaluation. Front Artif Intell 2024; 7:1345445. [PMID: 38444962 PMCID: PMC10912174 DOI: 10.3389/frai.2024.1345445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Accepted: 01/25/2024] [Indexed: 03/07/2024] Open
Abstract
Hate speech detection in Arabic presents a multifaceted challenge due to the language's broad and diverse linguistic terrain. With its multiple dialects and rich cultural subtleties, Arabic requires particular measures to address online hate speech successfully. To address this issue, academics and developers have used natural language processing (NLP) methods and machine learning algorithms adapted to the complexities of Arabic text. However, many proposed methods were hampered by the lack of a comprehensive dataset/corpus of Arabic hate speech. In this research, we propose a novel multi-class public Arabic dataset comprising 403,688 annotated tweets categorized as extremely positive, positive, neutral, or negative based on the presence of hate speech. Using our developed dataset, we additionally characterize the performance of multiple machine learning models for hate speech identification in Arabic Jordanian dialect tweets. Specifically, the Word2Vec, TF-IDF, and AraBert text representation models have been applied to produce word vectors, which provide the classification models with vector representations of the text. After that, seven machine learning classifiers have been evaluated: Support Vector Machine (SVM), Logistic Regression (LR), Naive Bayes (NB), Random Forest (RF), AdaBoost (Ada), XGBoost (XGB), and CatBoost (CatB). The experimental evaluation revealed that, in this challenging and unstructured setting, our gathered and annotated dataset proved effective and generated encouraging assessment outcomes. This will enable academics to delve further into this crucial field of study.
Affiliation(s)
- Ashraf Ahmad
  - Department of Computer Science, Princess Sumaya University for Technology (PSUT), Amman, Jordan
- Mohammad Azzeh
  - Department of Data Science, Princess Sumaya University for Technology (PSUT), Amman, Jordan
- Eman Alnagi
  - Department of Computer Science, Princess Sumaya University for Technology (PSUT), Amman, Jordan
- Qasem Abu Al-Haija
  - Department of Cybersecurity, Faculty of Computer and Information Technology, Jordan University of Science and Technology, Irbid, Jordan
- Dana Halabi
  - SAE Institute, Luminus Technical University College (LTUC), Amman, Jordan
- Abdullah Aref
  - Department of Computer Science, Princess Sumaya University for Technology (PSUT), Amman, Jordan
- Yousef AbuHour
  - Department of Basic Sciences, Princess Sumaya University for Technology (PSUT), Amman, Jordan
3
Ghnemat R, Alodibat S, Abu Al-Haija Q. Explainable Artificial Intelligence (XAI) for Deep Learning Based Medical Imaging Classification. J Imaging 2023; 9:177. [PMID: 37754941 PMCID: PMC10532018 DOI: 10.3390/jimaging9090177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2023] [Revised: 08/19/2023] [Accepted: 08/23/2023] [Indexed: 09/28/2023] Open
Abstract
Recently, deep learning has gained significant attention as a noteworthy division of artificial intelligence (AI) due to its high accuracy and versatile applications. However, one of the major challenges of AI is the need for more interpretability, commonly referred to as the black-box problem. In this study, we introduce an explainable AI model for medical image classification to enhance the interpretability of the decision-making process. Our approach is based on segmenting the images to provide a better understanding of how the AI model arrives at its results. We evaluated our model on five datasets, including the COVID-19 and Pneumonia Chest X-ray dataset, Chest X-ray (COVID-19 and Pneumonia), COVID-19 Image Dataset (COVID-19, Viral Pneumonia, Normal), and COVID-19 Radiography Database. We achieved testing and validation accuracy of 90.6% on a relatively small dataset of 6432 images. Our proposed model improved accuracy and reduced time complexity, making it more practical for medical diagnosis. Our approach offers a more interpretable and transparent AI model that can enhance the accuracy and efficiency of medical diagnosis.
Affiliation(s)
- Rawan Ghnemat
  - Department of Computer Science, Princess Sumaya University for Technology, Amman 11941, Jordan
- Sawsan Alodibat
  - Department of Computer Science, Princess Sumaya University for Technology, Amman 11941, Jordan
- Qasem Abu Al-Haija
  - Department of Cybersecurity, Princess Sumaya University for Technology, Amman 11941, Jordan
4
Abu Al-Haija Q, Al-Fayoumi M. An intelligent identification and classification system for malicious uniform resource locators (URLs). Neural Comput Appl 2023; 35:1-17. [PMID: 37362563 PMCID: PMC10117275 DOI: 10.1007/s00521-023-08592-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Accepted: 04/05/2023] [Indexed: 06/28/2023]
Abstract
A Uniform Resource Locator (URL) is a unique identifier composed of a protocol and a domain name, used to locate and retrieve a resource on the Internet. Like any Internet service, URLs (also called websites) are vulnerable to compromise by attackers, who develop malicious URLs that can exploit or devastate the user's information and resources. Malicious URLs are usually designed with the intention of promoting cyber-attacks such as spam, phishing, malware, and defacement. These websites usually require action on the user's side and can reach users through emails, text messages, pop-ups, or devious advertisements. Their potential impact can extend, in some cases, to compromising the machine or network of the user, especially for those arriving by email. Therefore, developing systems to detect malicious URLs is of great interest nowadays. This paper proposes a high-performance machine learning-based detection system to identify malicious URLs. The proposed system provides two layers of detection. Firstly, we identify the URLs as either benign or malicious using a binary classifier. Secondly, we classify the URLs based on their features into five classes: benign, spam, phishing, malware, and defacement. Specifically, we report on four ensemble learning approaches, viz. the ensemble of bagging trees (En_Bag), the ensemble of k-nearest neighbor (En_kNN), the ensemble of boosted decision trees (En_Bos), and the ensemble of subspace discriminator (En_Dsc). The developed approaches have been evaluated on an inclusive and contemporary dataset for uniform resource locators (ISCX-URL2016). ISCX-URL2016 provides a lightweight dataset for detecting and categorizing malicious URLs according to their attack type and lexical analysis. Conventional machine learning evaluation measurements are used to evaluate the detection accuracy, precision, recall, F-score, and detection time.
Our experimental assessment indicates that the ensemble of bagging trees (En_Bag) approach provides better performance rates than the other ensemble methods, while the ensemble of k-nearest neighbor (En_kNN) approach provides the highest inference speed. We also contrast our En_Bag model with state-of-the-art solutions and show its superiority in binary classification and multi-classification with accuracy rates of 99.3% and 97.92%, respectively.
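The evaluation measurements named in this abstract (accuracy, precision, recall, F-score) can be illustrated with a minimal Python sketch. The function name, the label strings, and the toy predictions are assumptions made for illustration only; none of them come from the paper or from ISCX-URL2016.

```python
def binary_metrics(y_true, y_pred, positive="malicious"):
    """Accuracy, precision, recall, and F1 for one positive class.

    Hypothetical helper: `positive` names the class treated as the
    detection target (an assumption, not the paper's notation).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels, invented for this example.
y_true = ["malicious", "malicious", "benign", "benign", "malicious"]
y_pred = ["malicious", "benign", "benign", "malicious", "malicious"]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

For the five-class stage, the same counts would be computed once per class and then averaged; which averaging the authors used is not stated in the abstract.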
Affiliation(s)
- Qasem Abu Al-Haija
  - Department of Cybersecurity, Princess Sumaya University for Technology (PSUT), Amman, Jordan
- Mustafa Al-Fayoumi
  - Department of Cybersecurity, Princess Sumaya University for Technology (PSUT), Amman, Jordan
5
Abu Al-Haija Q, Alohaly M, Odeh A. A Lightweight Double-Stage Scheme to Identify Malicious DNS over HTTPS Traffic Using a Hybrid Learning Approach. Sensors (Basel) 2023; 23:3489. [PMID: 37050549 PMCID: PMC10098885 DOI: 10.3390/s23073489] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Revised: 03/20/2023] [Accepted: 03/24/2023] [Indexed: 06/19/2023]
Abstract
The Domain Name System (DNS) protocol essentially translates domain names to IP addresses, enabling browsers to load and utilize Internet resources. Despite its major role, DNS is vulnerable to various security loopholes that attackers have continually abused. Therefore, delivering secure DNS traffic has become challenging since attackers use advanced and fast malicious information-stealing approaches. To overcome DNS vulnerabilities, the DNS over HTTPS (DoH) protocol was introduced to improve the security of the DNS protocol by encrypting the DNS traffic and communicating it over a covert network channel. This paper proposes a lightweight, double-stage scheme to identify malicious DoH traffic using a hybrid learning approach. The system comprises two layers. At the first layer, the traffic is examined using random fine trees (RF) and identified as DoH traffic or non-DoH traffic. At the second layer, the DoH traffic is further investigated using Adaboost trees (ADT) and identified as benign DoH or malicious DoH. Specifically, the proposed system is lightweight since it works with the fewest features (using only six out of thirty-three features), selected using principal component analysis (PCA), and minimizes the number of samples using a random under-sampling (RUS) approach. The experimental evaluation reported a high-performance system with a predictive accuracy of 99.4% and 100% and a predictive overhead of 0.83 µs and 2.27 µs for layer one and layer two, respectively. Hence, the reported results are superior and surpass existing models, given that our proposed model uses only 18% of the feature set and 17% of the sample set, distributed in balanced classes.
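The two-layer flow described in this abstract can be sketched as a simple cascade. This is only a structural illustration: the threshold rules below stand in for the paper's trained tree models, and every name, feature position, and threshold is hypothetical. The last lines reproduce the abstract's arithmetic that six of thirty-three features is roughly 18%.

```python
from typing import Callable, Sequence

Flow = Sequence[float]  # hypothetical feature vector for one traffic flow


def classify_flow(flow: Flow,
                  is_doh: Callable[[Flow], bool],
                  is_malicious: Callable[[Flow], bool]) -> str:
    """Layer 1 separates DoH from non-DoH; layer 2 runs only on DoH flows."""
    if not is_doh(flow):
        return "non-DoH"
    return "malicious DoH" if is_malicious(flow) else "benign DoH"


# Stand-in classifiers (assumptions): simple thresholds on invented features.
def toy_is_doh(flow: Flow) -> bool:
    return flow[0] > 0.5


def toy_is_malicious(flow: Flow) -> bool:
    return flow[1] > 0.5


flows = [(0.1, 0.9), (0.8, 0.2), (0.9, 0.7)]
labels = [classify_flow(f, toy_is_doh, toy_is_malicious) for f in flows]
print(labels)

# Share of the feature set retained: 6 of 33 features.
feature_fraction = 6 / 33
print(f"{feature_fraction:.0%}")
```

Passing the two classifiers in as callables keeps the cascade independent of how each layer is trained, so any model exposing a boolean decision could be substituted.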
Affiliation(s)
- Qasem Abu Al-Haija
  - Department of Cybersecurity, Princess Sumaya University for Technology (PSUT), Amman 11941, Jordan
- Manar Alohaly
  - Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
- Ammar Odeh
  - Department of Computer Science, Princess Sumaya University for Technology (PSUT), Amman 11941, Jordan
6
Ibrahim RF, Abu Al-Haija Q, Ahmad A. DDoS Attack Prevention for Internet of Thing Devices Using Ethereum Blockchain Technology. Sensors (Basel) 2022; 22:6806. [PMID: 36146163 PMCID: PMC9505972 DOI: 10.3390/s22186806] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/10/2022] [Revised: 09/06/2022] [Accepted: 09/06/2022] [Indexed: 06/16/2023]
Abstract
The Internet of Things (IoT) has widely expanded due to its advantages in enhancing the business, industrial, and social ecosystems. Nevertheless, IoT infrastructure is susceptible to several cyber-attacks due to the endpoint devices' restrictions in computation, storage, and communication capacity. As such, distributed denial-of-service (DDoS) attacks pose a serious threat to the security of the IoT. Attackers can easily utilize IoT devices as part of botnets to launch DDoS attacks by taking advantage of their flaws. This paper proposes an Ethereum blockchain model to detect and prevent DDoS attacks against IoT systems. Additionally, the proposed system can be used to resolve the single points of failure (dependencies on third parties) and privacy and security in IoT systems. First, we propose implementing a decentralized platform in place of current centralized system solutions to prevent DDoS attacks on IoT devices at the application layer by authenticating and verifying these devices. Second, we suggest tracing and recording the IP address of malicious devices inside the blockchain to prevent them from connecting and communicating with the IoT networks. The system performance has been evaluated by performing 100 experiments to evaluate the time taken by the authentication process. The proposed system highlights two messages with a time of 0.012 ms: the first is the request transmitted from the IoT follower device to join the blockchain, and the second is the blockchain response. The experimental evaluation demonstrated the superiority of our system because there are fewer I/O operations in the proposed system than in other related works, and thus it runs substantially faster.
7
Abu Al-Haija Q. Leveraging ShuffleNet transfer learning to enhance handwritten character recognition. Gene Expr Patterns 2022; 45:119263. [PMID: 35850482 DOI: 10.1016/j.gep.2022.119263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2021] [Revised: 05/01/2022] [Accepted: 07/09/2022] [Indexed: 12/01/2022]
Abstract
Handwritten character recognition has long been a fascinating field of study in pattern recognition due to its numerous real-life applications, such as reading tools for blind people and for handwritten bank cheques. Therefore, the proper and accurate conversion of handwriting into organized digital files that can be easily recognized and processed by computer algorithms is required for various applications and systems. This paper proposes an accurate and precise autonomous structure for handwriting recognition using a ShuffleNet convolutional neural network to produce a multi-class recognition for offline handwritten characters and numbers. The developed system utilizes transfer learning of the powerful ShuffleNet CNN to train, validate, recognize, and categorize the handwritten character/digit image dataset into 26 classes for the English characters and ten classes for the digits. The experimental outcomes showed that the proposed recognition system achieves an overall recognition accuracy peaking at 99.50%, outperforming the other character recognition systems reported in the state of the art. In addition, a low computational cost was observed for the proposed model, which recorded an average of 2.7 ms for single-sample inference.
Affiliation(s)
- Qasem Abu Al-Haija
  - Department of Computer Science/Cybersecurity, Princess Sumaya University for Technology, Amman, Jordan
8
Zidi S, Mihoub A, Mian Qaisar S, Krichen M, Abu Al-Haija Q. Theft detection dataset for benchmarking and machine learning based classification in a smart grid environment. Journal of King Saud University - Computer and Information Sciences 2022. [DOI: 10.1016/j.jksuci.2022.05.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 10/18/2022]
9
10
Abu Al-Haija Q. Top-Down Machine Learning-Based Architecture for Cyberattacks Identification and Classification in IoT Communication Networks. Front Big Data 2022; 4:782902. [PMID: 35098112 PMCID: PMC8792902 DOI: 10.3389/fdata.2021.782902] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/24/2021] [Accepted: 12/15/2021] [Indexed: 11/28/2022] Open
Abstract
With the prompt revolution and emergence of smart, self-reliant, and low-power devices, the Internet of Things (IoT) has inconceivably expanded and impacted almost every real-life application. Nowadays, for example, machines and devices are now fully reliant on computer control and, instead, they have their own programmable interfaces, such as cars, unmanned aerial vehicles (UAVs), and medical devices. With this increased use of IoT, attack capabilities have increased in response, making it imperative that new methods for securing these systems be developed to detect attacks launched against IoT devices and gateways. These attacks are usually aimed at accessing, changing, or destroying sensitive information; extorting money from users; or interrupting normal business processes. In this article, we propose a new, efficient, and generic top-down architecture for intrusion detection and classification in IoT networks using non-traditional machine learning. The proposed architecture can be customized and used for intrusion detection/classification incorporating any IoT cyber-attack dataset, such as the CICIDS dataset, the MQTT dataset, and others. Specifically, the proposed system is composed of three subsystems: a feature engineering (FE) subsystem, a feature learning (FL) subsystem, and a detection and classification (DC) subsystem. All subsystems have been thoroughly described and analyzed in this article. Accordingly, the proposed architecture employs deep learning models to enable the detection of slightly mutated attacks on IoT networking with high detection/classification accuracy for IoT traffic obtained from either a real-time system or a pre-collected dataset.
Since this work draws on systems engineering (SE) techniques, machine learning technology, and the cybersecurity of IoT systems, the collective cooperation of the three fields has successfully yielded a systematically engineered system that can be implemented with high-performance trajectories.
11
Abu Al-Haija Q, Al-Badawi A. Attack-Aware IoT Network Traffic Routing Leveraging Ensemble Learning. Sensors (Basel) 2021; 22:s22010241. [PMID: 35009784 PMCID: PMC8749547 DOI: 10.3390/s22010241] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 12/05/2021] [Revised: 12/26/2021] [Accepted: 12/28/2021] [Indexed: 11/21/2022]
Abstract
Network Intrusion Detection Systems (NIDSs) are indispensable defensive tools against various cyberattacks. Lightweight, multipurpose, and anomaly-based detection NIDSs employ several methods to build profiles for normal and malicious behaviors. In this paper, we design, implement, and evaluate the performance of machine-learning-based NIDSs in IoT networks. Specifically, we study six supervised learning methods that belong to three different classes: (1) ensemble methods, (2) neural network methods, and (3) kernel methods. To evaluate the developed NIDSs, we use the distilled-Kitsune-2018 and NSL-KDD datasets, both consisting of contemporary real-world IoT network traffic subjected to different network attacks. Standard performance evaluation metrics from the machine-learning literature are used to evaluate the identification accuracy, error rates, and inference speed. Our empirical analysis indicates that ensemble methods provide better accuracy and lower error rates compared with neural network and kernel methods. On the other hand, neural network methods provide the highest inference speed, which proves their suitability for high-bandwidth networks. We also provide a comparison with state-of-the-art solutions and show that our best results are better than any prior art by 1 to 20%.
Affiliation(s)
- Qasem Abu Al-Haija
  - Department of Computer Science/Cybersecurity, Princess Sumaya University for Technology, Amman 11941, Jordan
- Ahmad Al-Badawi
  - Department of Homeland Security, Rabdan Academy (RA), Abu Dhabi 22401, United Arab Emirates
12
Abstract
In this paper, we present a scheme for forecasting the growth in the number of molecular structures released every year from NMR and X-ray crystallography experiments by employing an autoregressive (AR) process. The proposed scheme maximises the forecasting accuracy by utilising the optimal AR process order, derived as the model order with the least prediction error. The proposed scheme has been employed to model and predict the annual growth of structures from NMR and X-ray crystallography experimental data for the next decade, 2019–2028, using the time series of the past 43 years of both experimental datasets. The experimental results showed that the optimal model order to estimate both datasets was [Formula: see text], which corresponds to a forecasting accuracy of [Formula: see text] for both datasets. Such a high level of accuracy is attributed to the degree of linearity between consecutive elements of the original time series. Hence, the forecasting results reveal exponentially increasing behaviour in the future growth of the annual structures released from both NMR and X-ray crystallography experiments.
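Choosing an AR order by least prediction error can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not say how prediction error was measured, so the holdout split, the least-squares fit, the synthetic series, and all function names here are hypothetical and are not the authors' procedure or data.

```python
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ar(series, order):
    """Least-squares fit of x[t] = c + sum_k a_k * x[t-k]; returns [c, a_1..a_p]."""
    rows = [[1.0] + [series[t - k] for k in range(1, order + 1)]
            for t in range(order, len(series))]
    targets = series[order:]
    dim = order + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(dim)]
           for i in range(dim)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(dim)]
    return solve_linear(xtx, xty)


def holdout_mse(series, order, n_test):
    """Fit on all but the last n_test points; score one-step-ahead forecasts."""
    coeffs = fit_ar(series[:-n_test], order)
    errors = []
    for t in range(len(series) - n_test, len(series)):
        pred = coeffs[0] + sum(coeffs[k] * series[t - k]
                               for k in range(1, order + 1))
        errors.append((series[t] - pred) ** 2)
    return sum(errors) / len(errors)


def best_order(series, max_order, n_test):
    """Return the order with the smallest holdout error, plus all scores."""
    scores = {p: holdout_mse(series, p, n_test)
              for p in range(1, max_order + 1)}
    return min(scores, key=scores.get), scores


# Synthetic AR(2)-like series with seeded noise (invented, not the paper's data).
rng = random.Random(0)
series = [1.0, 0.5]
for _ in range(60):
    series.append(0.6 * series[-1] - 0.2 * series[-2] + rng.gauss(0.0, 1.0))

order, scores = best_order(series, max_order=4, n_test=12)
print(order, scores)
```

A holdout error is used here because the in-sample error of nested least-squares AR fits cannot increase with the order, which would always favour the largest candidate.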
Affiliation(s)
- Kamal Al Nasr
  - Department of Computer Science, Tennessee State University, Nashville, TN, USA
  - University of Texas, San Antonio, TX, USA
- Qasem Abu Al-Haija
  - Department of Computer and Information Systems Engineering (CISE), Tennessee State University, Nashville, TN, USA
13
Jebril NA, Al-Zoubi HR, Abu Al-Haija Q. Recognition of Handwritten Arabic Characters using Histograms of Oriented Gradient (HOG). Pattern Recognit Image Anal 2018. [DOI: 10.1134/s1054661818020141] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Indexed: 11/23/2022]