1. Hammad M, Chelloug SA, Alayed W, Abd El-Latif AA. Optimizing Multimodal Scene Recognition through Mutual Information-Based Feature Selection in Deep Learning Models. Applied Sciences 2023; 13:11829. [DOI: 10.3390/app132111829]
Abstract
The field of scene recognition, which lies at the crossroads of computer vision and artificial intelligence, has experienced notable progress through scholarly pursuits. This article introduces a novel methodology for scene recognition that combines convolutional neural networks (CNNs) with feature selection based on mutual information (MI). The main goal of our study is to address the limitations inherent in conventional unimodal methods, with the aim of improving the precision and dependability of scene classification. Our research focuses on formulating a comprehensive approach for scene detection, using multimodal deep learning methodologies applied to a single input image. Our work distinguishes itself through the innovative combination of CNN- and MI-based feature selection, which provides distinct advantages and enhanced capabilities compared with prevailing methodologies. To assess the effectiveness of our methodology, we performed tests on two openly accessible datasets, namely the scene categorization dataset and the AID dataset, achieving accuracies of 100% and 98.83%, respectively; these findings surpass the performance of other established techniques. Our end-to-end approach aims to reduce complexity and resource requirements, thereby creating a robust framework for scene categorization. This work advances the practical application of computer vision in various real-world scenarios, markedly improving the accuracy of scene recognition and interpretation.
Affiliation(s)
- Mohamed Hammad: EIAS Data Science Lab, College of Computer and Information Sciences, Prince Sultan University, Riyadh 11586, Saudi Arabia; Department of Information Technology, Faculty of Computers and Information, Menoufia University, Shebin El Kom 32511, Egypt
- Samia Allaoua Chelloug: Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh 11671, Saudi Arabia
- Walaa Alayed: Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh 11671, Saudi Arabia
- Ahmed A. Abd El-Latif: EIAS Data Science Lab, College of Computer and Information Sciences, Prince Sultan University, Riyadh 11586, Saudi Arabia; Faculty of Informatics and Computing, Universiti Sultan Zainal Abidin (UniSZA), Besut Campus, Besut 22200, Malaysia; Department of Mathematics and Computer Science, Faculty of Science, Menoufia University, Shebin Elkom 32511, Egypt
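The MI-based selection step the abstract describes can be sketched with scikit-learn's mutual_info_classif. This is a minimal illustration under stated assumptions, not the authors' code: the feature matrix stands in for CNN embeddings extracted from scene images, and the labels, classifier, and choice of k are hypothetical.

```python
# Minimal sketch: rank feature dimensions by mutual information with the
# class labels and keep the top-k before classification.
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.svm import SVC

def select_by_mi(features, labels, k=128):
    """Return indices of the k feature columns with highest MI."""
    mi = mutual_info_classif(features, labels, random_state=0)
    return np.argsort(mi)[::-1][:k]

# Hypothetical stand-ins for CNN embeddings and scene labels
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 512))
labels = rng.integers(0, 5, size=200)

top_idx = select_by_mi(features, labels)
clf = SVC().fit(features[:, top_idx], labels)   # train on selected dims only
print("selected dims:", top_idx[:10])
```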
2. Improved U-Net Remote Sensing Classification Algorithm Fusing Attention and Multiscale Features. Remote Sensing 2022; 14:3591. [DOI: 10.3390/rs14153591]
Abstract
The selection and representation of classification features in remote sensing images play a crucial role in classification accuracy. To improve this accuracy, an improved U-Net remote sensing classification algorithm fusing attention and multiscale features is proposed in this paper, called the spatial attention-atrous spatial pyramid pooling U-Net (SA-UNet). This framework connects atrous spatial pyramid pooling (ASPP) to the convolutional units of the original U-Net encoder in the form of residuals. The ASPP module expands the receptive field, integrates multiscale features in the network, and enhances the expression of shallow features. Through the fusion residual module, shallow and deep features are deeply fused and their complementary characteristics further exploited. A spatial attention mechanism combines spatial with semantic information so that the decoder can recover more spatial detail. In this study, the crop distribution in central Guangxi province was analyzed, and experiments were conducted on Landsat 8 multispectral remote sensing images. The results showed that the improved algorithm raises the overall classification accuracy from 93.33% to 96.25%, while the segmentation accuracies for sugarcane, rice, and other land increase from 96.42%, 63.37%, and 88.43% to 98.01%, 83.21%, and 95.71%, respectively. The agricultural planting areas obtained by the proposed algorithm can serve as input data for regional ecological models, supporting the development of accurate, real-time crop growth models.
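The ASPP building block can be sketched in a few lines of PyTorch. This is a generic ASPP with a residual connection, assuming equal input and output channel counts; the dilation rates and channel widths are illustrative, not the SA-UNet configuration.

```python
# Sketch of an atrous spatial pyramid pooling (ASPP) block attached to an
# encoder stage in residual form; rates/widths are illustrative.
import torch
import torch.nn as nn

class ASPP(nn.Module):
    def __init__(self, channels, rates=(1, 6, 12, 18)):
        super().__init__()
        # Parallel dilated 3x3 branches with increasing receptive fields
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for r in rates
        ])
        # 1x1 conv fuses the concatenated multiscale branches
        self.project = nn.Conv2d(channels * len(rates), channels, 1)

    def forward(self, x):
        feats = torch.cat([b(x) for b in self.branches], dim=1)
        return self.project(feats) + x  # residual fusion with the input

x = torch.randn(1, 64, 32, 32)
print(ASPP(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```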
3
|
Tian Y, Zhao X, Huang W. Meta-learning approaches for learning-to-learn in deep learning: A survey. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.04.078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
4. Experimental Study on Wound Area Measurement with Mobile Devices. Sensors 2021; 21:5762. [PMID: 34502653] [DOI: 10.3390/s21175762]
Abstract
Healthcare treatments can benefit from advances in artificial intelligence and from devices such as smartphones and smartwatches. The cameras in these devices, combined with increasingly robust and precise pattern recognition techniques, can facilitate the estimation of wound area and other telemedicine measurements. Telemedicine is currently vital to maintaining the quality of treatment delivered remotely. This study proposes a method for measuring wound area with mobile devices. The approach relies on a multi-step process: image capture, conversion to grayscale, blurring, thresholding with segmentation, identification of the wound region, dilation and erosion of the detected wound section, extraction of accurate image-related data, and measurement of the wound area. The method was implemented with the OpenCV framework and thus offers healthcare systems a tool for investigating and treating people with skin-related diseases. A proof-of-concept was performed on a desktop computer with a static dataset of camera images. After validating the approach's feasibility, we implemented the method in a mobile application that allows communication between patients, caregivers, and healthcare professionals.
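The described pipeline maps almost one-to-one onto OpenCV calls. The following is a minimal sketch under assumed parameters (Otsu thresholding, a 5x5 morphology kernel, a hypothetical input file wound.jpg); the paper's exact thresholds and step ordering may differ.

```python
# Minimal sketch of the multi-step pipeline: grayscale, blur, threshold
# with segmentation, morphological cleanup, then area in pixels.
import cv2
import numpy as np

def wound_area_pixels(image_bgr):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    # Otsu picks the threshold automatically; the wound is assumed darker
    _, mask = cv2.threshold(blurred, 0, 255,
                            cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.dilate(mask, kernel, iterations=1)
    mask = cv2.erode(mask, kernel, iterations=1)
    # The largest external contour is taken as the wound region
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return cv2.contourArea(max(contours, key=cv2.contourArea)) if contours else 0.0

img = cv2.imread("wound.jpg")  # hypothetical input image
assert img is not None, "provide a test image"
print("wound area (px):", wound_area_pixels(img))
```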
5
|
Wang C, Wu Y, Wang Y, Chen Y. Scene Recognition Using Deep Softpool Capsule Network Based on Residual Diverse Branch Block. SENSORS (BASEL, SWITZERLAND) 2021; 21:5575. [PMID: 34451017 PMCID: PMC8402264 DOI: 10.3390/s21165575] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/07/2021] [Revised: 07/29/2021] [Accepted: 08/12/2021] [Indexed: 12/02/2022]
Abstract
With the improvement of the quality and resolution of remote sensing (RS) images, scene recognition has come to play an important role in the RS community. However, owing to the special bird's-eye-view acquisition mode of imaging sensors, it remains challenging to construct a discriminative representation of diverse and complex scenes that improves RS image recognition performance. Capsule networks, which can learn the spatial relationships between the features in an image, perform well at image classification, but the original capsule network is not suitable for images with complex backgrounds. To address these issues, this paper proposes a novel end-to-end capsule network termed DS-CapsNet, in which a new multi-scale feature enhancement module and a new Caps-SoftPool method are introduced, aggregating the advantageous attributes of a residual convolution architecture, the Diverse Branch Block (DBB), and the Squeeze-and-Excitation (SE) block. Using the residual DBB, multiscale features can be extracted and fused to recover a semantically strong feature representation. By adopting SE, informative features are emphasized and less salient features are weakened. The new Caps-SoftPool method reduces the number of parameters needed, which helps prevent overfitting. DS-CapsNet achieves a competitive and promising performance for RS image recognition by using a high-quality and robust capsule representation. Extensive experiments on two challenging datasets, AID and NWPU-RESISC45, demonstrate the robustness and superiority of the proposed DS-CapsNet in scene recognition tasks.
Affiliation(s)
- Chunyuan Wang: School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
- Yang Wu: Shanghai Institute of Satellite Engineering, Shanghai 200240, China
- Yihan Wang: School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
- Yiping Chen: School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
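SoftPool, the pooling operation that Caps-SoftPool builds on, can be expressed compactly in PyTorch: each pooling window is reduced with exponential weights, so strong activations dominate but weaker ones still contribute, unlike max pooling. A minimal sketch (the capsule-specific integration is omitted):

```python
# Minimal sketch of 2D SoftPool: each window is reduced as
# sum(exp(x) * x) / sum(exp(x)).
import torch
import torch.nn.functional as F

def soft_pool2d(x, kernel_size=2, stride=2):
    # In practice exp() may need a max-subtraction for numerical stability
    w = torch.exp(x)
    return F.avg_pool2d(w * x, kernel_size, stride) / F.avg_pool2d(w, kernel_size, stride)

x = torch.randn(1, 8, 32, 32)
print(soft_pool2d(x).shape)  # torch.Size([1, 8, 16, 16])
```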
6
|
A Systematic Investigation of Models for Color Image Processing in Wound Size Estimation. COMPUTERS 2021. [DOI: 10.3390/computers10040043] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/04/2023]
Abstract
In recent years, research on tracking and assessing wound severity using computerized image processing has increased. With the emergence of mobile devices, powerful functionalities and processing capabilities have opened multiple non-invasive wound evaluation opportunities in both clinical and non-clinical settings. Current imaging technologies provide objective and reliable qualitative information that can be further processed into quantitative information on the size, structure, and color characteristics of wounds. Efficient image analysis algorithms help determine injury features and the progress of healing in a short time. This paper presents a systematic investigation of articles that specifically address the measurement of wound size with image processing techniques, promoting the connection between computer science and health. Of the 208 studies identified by searching electronic databases, 20 were included in the review. Among image processing color models, the most dominant was the hue, saturation, and value (HSV) color space. We propose that a wound-area measurement method should proceed in stages: conversion to grayscale, thresholding and segmentation to measure the wound area as a number of pixels, and conversion of that pixel count to metric units. Regarding devices, mobile technology is shown to have reached a reliable level of accuracy.
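A minimal sketch of the HSV-based stage the review highlights, with the pixel-to-metric conversion it recommends; the HSV bounds and the pixels_per_cm scale (which would in practice come from a reference marker of known size in the frame) are illustrative assumptions.

```python
# Segment reddish wound tissue in HSV, count pixels, convert to cm^2.
import cv2

def wound_area_cm2(image_bgr, pixels_per_cm):
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    # Illustrative bounds for reddish tissue; tune per acquisition setup
    mask = cv2.inRange(hsv, (0, 60, 40), (12, 255, 255))
    wound_pixels = cv2.countNonZero(mask)
    return wound_pixels / (pixels_per_cm ** 2)

img = cv2.imread("wound.jpg")  # hypothetical input image
assert img is not None, "provide a test image"
print("wound area (cm^2):", wound_area_cm2(img, pixels_per_cm=38.0))
```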
7
|
Multi-Horizon Air Pollution Forecasting with Deep Neural Networks. SENSORS 2021; 21:s21041235. [PMID: 33578633 PMCID: PMC7916344 DOI: 10.3390/s21041235] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/19/2020] [Revised: 01/26/2021] [Accepted: 01/27/2021] [Indexed: 11/18/2022]
Abstract
Air pollution is a global problem, especially in urban areas, where population density is high and pollutant sources such as vehicles, industrial plants, buildings, and waste are diverse. North Macedonia, as a developing country, has a serious air pollution problem. It is most pronounced in the capital city, Skopje, which air pollution consistently places among the 10 most polluted cities in the world during the winter months. In this work, we propose Recurrent Neural Network (RNN) models with long short-term memory units to predict the level of PM10 particles 6, 12, and 24 h into the future. We employ historical air quality measurements from sensors placed at multiple locations in Skopje together with meteorological conditions such as temperature and humidity. We compare the performance of several deep learning models against an Auto-Regressive Integrated Moving Average (ARIMA) baseline. The results show that the proposed models consistently outperform the baseline and can be successfully employed for air pollution prediction. Such models can help decision-makers and local authorities better manage the consequences of air pollution by taking proactive measures.
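A minimal PyTorch sketch of such a multi-horizon LSTM forecaster; the feature set (PM10, temperature, humidity), the 48-hour window, and the layer sizes are illustrative assumptions, not the paper's configuration.

```python
# A window of past sensor readings in, one prediction per horizon out.
import torch
import torch.nn as nn

class PM10Forecaster(nn.Module):
    def __init__(self, n_features=3, hidden=64, horizons=3):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, horizons)  # one output per horizon

    def forward(self, x):             # x: (batch, time, features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])  # predict from the last hidden state

model = PM10Forecaster()
window = torch.randn(8, 48, 3)        # 48 past hours of PM10, temp, humidity
print(model(window).shape)            # torch.Size([8, 3]) -> 6 h, 12 h, 24 h
```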
8. Modules and Techniques for Motion Planning: An Industrial Perspective. Sensors 2021; 21:420. [PMID: 33435294] [DOI: 10.3390/s21020420]
Abstract
Research on autonomous cars has become one of the main research paths in the automotive industry, with many critical issues still to be explored regarding both the overall methodology and its practical applicability. In this paper, we present an industrial experience in which we built a complete autonomous driving system, from the sensor units to the car control equipment, and we describe its adoption and field-testing phase. We report how we organize data fusion and map manipulation to represent the surrounding environment. We focus on the communication and synchronization issues between the data-fusion device and the path planner, between the CPU and the GPU, and among the different CUDA kernels implementing the core local planner module. In these frameworks, we propose simple representation strategies and approximation techniques that incur almost no penalty in accuracy while yielding large savings in memory occupation and memory transfer times. We show how we adopt a recent implementation on parallel many-core devices, such as CUDA-based GPGPUs, to reduce the computational burden of Rapidly exploring Random Trees (RRTs) used to explore the state space along a given reference path. We also report on our use of the controller and the vehicle simulator. We ran experiments on several real scenarios and report the paths generated under different settings, with their relative errors and computation times, showing that our approach can generate reasonable paths for a multitude of standard maneuvers in real time.
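The RRT core that the paper offloads to CUDA can be sketched in plain Python; this toy 2D version omits obstacles, the reference path, and all GPU parallelism, and the workspace bounds, step size, and goal tolerance are illustrative.

```python
# Minimal 2D RRT: grow a tree toward random samples until the goal is reached.
import math
import random

def rrt(start, goal, step=0.5, iters=2000, goal_tol=0.5):
    nodes = [start]
    parents = {start: None}
    for _ in range(iters):
        sample = (random.uniform(0, 10), random.uniform(0, 10))
        near = min(nodes, key=lambda n: math.dist(n, sample))  # nearest node
        d = math.dist(near, sample)
        new = (near[0] + step * (sample[0] - near[0]) / d,     # step toward sample
               near[1] + step * (sample[1] - near[1]) / d)
        nodes.append(new)
        parents[new] = near
        if math.dist(new, goal) < goal_tol:                    # reconstruct path
            path = [new]
            while parents[path[-1]] is not None:
                path.append(parents[path[-1]])
            return path[::-1]
    return None

print("path found:", rrt((0.0, 0.0), (9.0, 9.0)) is not None)
```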
9.
Abstract
Distracted driving has become a leading cause of vehicle crashes. This paper proposes a data augmentation method for distracted driving detection based on the driving operation area. First, the class activation mapping method is used to reveal the key feature regions for driving behavior analysis; the driving operation areas are then detected with a Faster R-CNN detection model for data augmentation. Finally, a convolutional neural network classification model is implemented and evaluated on both the original dataset and the driving-operation-area dataset. The classifier achieves 96.97% accuracy on the distracted driving dataset. The results show the necessity of extracting the driving operation area in the preprocessing stage, which effectively removes redundant information from the images and yields a higher classification accuracy. The method can be used to detect drivers in real application scenarios and identify dangerous driving behaviors, helping to give early warning of unsafe driving and avoid accidents.
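The two-stage idea, detect the operation area and classify the crop, can be sketched with torchvision's off-the-shelf Faster R-CNN; the COCO-pretrained detector below is only a stand-in for the paper's trained operation-area detector, and the fallback logic is an assumption.

```python
# Detect a region of interest, crop it, and hand the crop to a classifier.
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

detector = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()

def crop_operation_area(image):        # image: (3, H, W) float tensor in [0, 1]
    with torch.no_grad():
        det = detector([image])[0]     # boxes are sorted by score
    if len(det["boxes"]) == 0:
        return image                   # fall back to the full frame
    x1, y1, x2, y2 = det["boxes"][0].int().tolist()
    return image[:, y1:y2, x1:x2]      # crop fed to the behavior classifier

frame = torch.rand(3, 480, 640)        # hypothetical driver-camera frame
print(crop_operation_area(frame).shape)
```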
10. Air Pollution Prediction with Multi-Modal Data and Deep Neural Networks. Remote Sensing 2020; 12:4142. [DOI: 10.3390/rs12244142]
Abstract
Air pollution is a rising and serious environmental problem, especially in urban areas affected by an increasing migration rate. The wide availability of sensor data enables analytical tools that provide decision support. Sensors facilitate air pollution monitoring, but the lack of predictive capability limits such systems' potential in practical scenarios; forecasting methods, by contrast, can predict future pollution in specific areas and suggest useful preventive measures. Many works have tackled air pollution forecasting, most of them based on sequence models trained with raw pollution data. This paper proposes a novel approach, evaluating four different architectures that estimate air pollution from camera images of the monitored areas; the images are further enriched with weather data to boost classification accuracy. The approach exploits generative adversarial networks combined with data augmentation techniques to mitigate the class imbalance problem. Experiments show that the proposed method achieves a robust accuracy of up to 0.88, comparable to that of sequence models and conventional models trained on air pollution data. This is a remarkable result considering that historic air pollution data is directly related to the output (future air pollution), whereas the proposed architecture recognizes air pollution from camera images, an inherently much more difficult problem.
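One plausible way to fuse camera images with weather data, sketched in PyTorch: a small CNN encodes the frame and the weather scalars are concatenated before the classifier head. The architecture, feature count, and class count below are assumptions for illustration, not one of the paper's four evaluated architectures.

```python
# Late-fusion sketch: image features + weather scalars -> pollution class.
import torch
import torch.nn as nn

class PollutionFromImages(nn.Module):
    def __init__(self, n_weather=2, n_classes=5):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),      # -> (batch, 32)
        )
        self.head = nn.Linear(32 + n_weather, n_classes)

    def forward(self, image, weather):
        return self.head(torch.cat([self.cnn(image), weather], dim=1))

model = PollutionFromImages()
logits = model(torch.rand(4, 3, 224, 224), torch.rand(4, 2))  # temp, humidity
print(logits.shape)  # torch.Size([4, 5])
```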
11
|
When Self-Supervised Learning Meets Scene Classification: Remote Sensing Scene Classification Based on a Multitask Learning Framework. REMOTE SENSING 2020. [DOI: 10.3390/rs12203276] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
In recent years, the development of convolutional neural networks (CNNs) has driven continuous progress in the scene classification of remote sensing images. Compared with natural image datasets, however, remote sensing scene images are more difficult to acquire, so remote sensing image datasets are generally small. In addition, remote sensing scenes pose many problems related to small objects and complex backgrounds, presenting great challenges for CNN-based recognition methods. In this article, to improve the feature extraction and generalization abilities of such models and to make better use of the information contained in the original remote sensing images, we introduce a multitask learning framework that combines self-supervised learning with scene classification. Unlike previous multitask methods, we adopt a new mixup loss strategy to combine the two tasks with a dynamic weight. The proposed framework empowers a deep neural network to learn more discriminative features without increasing the number of parameters. Comprehensive experiments were conducted on four representative remote sensing scene classification datasets. We achieved state-of-the-art performance, with average accuracies of 94.21%, 96.89%, 99.11%, and 98.98% on the NWPU, AID, UC Merced, and WHU-RS19 datasets, respectively. The experimental results and visualizations show that the proposed method learns more discriminative features and simultaneously encodes orientation information while effectively improving the accuracy of remote sensing scene classification.
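A dynamically weighted two-task objective can be sketched as follows; the rotation-prediction pretext task and the linear weight schedule are common choices assumed for illustration and may differ from the paper's mixup loss strategy.

```python
# Combine supervised classification with a self-supervised rotation task
# under a weight that shifts toward classification as training progresses.
import torch
import torch.nn.functional as F

def multitask_loss(cls_logits, cls_labels, rot_logits, rot_labels,
                   epoch, total_epochs):
    w = epoch / total_epochs   # dynamic weight: SSL dominates early on
    loss_cls = F.cross_entropy(cls_logits, cls_labels)
    loss_ssl = F.cross_entropy(rot_logits, rot_labels)  # 0/90/180/270 degrees
    return w * loss_cls + (1.0 - w) * loss_ssl

cls_logits, rot_logits = torch.randn(8, 30), torch.randn(8, 4)
cls_y, rot_y = torch.randint(0, 30, (8,)), torch.randint(0, 4, (8,))
print(multitask_loss(cls_logits, cls_y, rot_logits, rot_y, 10, 100))
```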
12. Improving Human Activity Monitoring by Imputation of Missing Sensory Data: Experimental Study. Future Internet 2020; 12:155. [DOI: 10.3390/fi12090155]
Abstract
The automatic recognition of human activities with sensors available in off-the-shelf mobile devices has been the subject of several research studies in recent years. It can be useful for monitoring elderly people to detect warning situations, for monitoring the activity of athletes, and for other applications. However, the acquisition of data from the different sensors may fail for various reasons, and human activities are recognized with better accuracy when the datasets are complete. This paper focuses on two stages of a human activity recognition system: data imputation and data classification. For imputation, we propose a methodology for extrapolating the missing samples of a dataset to better recognize human activities, using the K-Nearest Neighbors (KNN) imputation technique to fill gaps in the captured data. For classification, the accuracy of the previously implemented method, Deep Neural Networks (DNN) with normalized and non-normalized data, improved relative to the earlier results obtained without data imputation.
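The imputation stage can be reproduced with scikit-learn's KNNImputer, which fills each missing value from the k nearest complete rows; the tiny sensor matrix and the choice of k below are illustrative.

```python
# Fill dropped sensor samples before the data reaches the DNN classifier.
import numpy as np
from sklearn.impute import KNNImputer

readings = np.array([
    [0.20, 1.10, 9.70],
    [0.30, np.nan, 9.80],   # dropped sample on the second channel
    [0.10, 1.00, np.nan],   # dropped sample on the third channel
    [0.25, 1.05, 9.75],
])
imputer = KNNImputer(n_neighbors=2)
print(imputer.fit_transform(readings))  # NaNs replaced by neighbor averages
```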