1
Ding J, Hu J, Lin J, Zhang X. Lightweight enhanced YOLOv8n underwater object detection network for low light environments. Sci Rep 2024; 14:27922. [PMID: 39537721 PMCID: PMC11561321 DOI: 10.1038/s41598-024-79211-7] [Received: 06/02/2024] [Accepted: 11/06/2024] [Indexed: 11/16/2024]
Abstract
To address target misidentification and missed detections caused by severe light attenuation, low visibility, and complex environments in underwater target detection, we propose a lightweight low-light underwater target detection network named PDSC-YOLOv8n. First, we enhance the input images with the improved Pro MSRCR algorithm for data augmentation. Second, we replace the standard convolutions in the backbone and neck of YOLOv8n with Ghost and GSConv modules, respectively, to make the network lightweight. We also integrate the improved DCNv3 module into the C2f module of the backbone to strengthen target feature extraction. Furthermore, we introduce GAM into the SPPF and incorporate the CBAM attention mechanism into the last layer of the backbone to improve the model's perceptual and generalization capabilities. Finally, we train the model with WIoUv3 as the loss function. The model is deployed on an embedded platform, where it achieves real-time detection of underwater targets. We evaluate the approach on the RUOD underwater dataset and on the Pascal VOC2012 dataset. On RUOD, the original YOLOv8n reaches 79.6% mAP@0.5 and 58.2% mAP@0.5:0.95, while PDSC-YOLOv8n reaches 86.1% and 60.8%; the number of parameters is reduced by about 15.5% and mAP@0.5 improves by 6.5 percentage points. On Pascal VOC, the original YOLOv8n reaches 73% mAP@0.5 and 53.2% mAP@0.5:0.95, while PDSC-YOLOv8n reaches 78.5% and 57%. These results indicate the effectiveness of PDSC-YOLOv8n for underwater target detection.
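As a reading aid for the mAP figures above: mAP@0.5 treats a predicted box as a match when its intersection over union (IoU) with a ground-truth box is at least 0.5, and mAP@0.5:0.95 averages over IoU thresholds from 0.5 to 0.95 in steps of 0.05. The sketch below shows only the plain IoU test; it is not the authors' WIoUv3 loss, and the function names and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region (zero when the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def matches_at(pred_box, gt_box, threshold=0.5):
    """Hypothetical helper: does the prediction count as a match at this IoU threshold?"""
    return iou(pred_box, gt_box) >= threshold
```

For example, two 2x2 boxes offset by one unit in each direction overlap in a 1x1 square, giving an IoU of 1/7, which is below the 0.5 threshold.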
Affiliation(s)
- Jifeng Ding
  - College of Information and Communication Engineering, Dalian Minzu University, Dalian, 116600, China
- Junquan Hu
  - College of Information and Communication Engineering, Dalian Minzu University, Dalian, 116600, China
- Jiayuan Lin
  - College of Information and Communication Engineering, Dalian Minzu University, Dalian, 116600, China
- Xiaotong Zhang
  - College of Information and Communication Engineering, Dalian Minzu University, Dalian, 116600, China
2
Xiang J, Yang Y, Bai J. Adaptive classification of artistic images using multi-scale convolutional neural networks. PeerJ Comput Sci 2024; 10:e2336. [PMID: 39650500 PMCID: PMC11622833 DOI: 10.7717/peerj-cs.2336] [Received: 06/03/2024] [Accepted: 08/27/2024] [Indexed: 12/11/2024]
Abstract
Current art image classification methods suffer from low recall and accuracy. To improve classification performance on art images, a new adaptive classification method is designed using multi-scale convolutional neural networks (CNNs). First, the multi-scale Retinex algorithm with color recovery is used to enhance the art images. Then the extreme pixel ratio is used to evaluate image quality and select the art images that can be analyzed. Next, edge detection is applied to extract the key features of the image, which serve as initial values of the items to be trained in the classification model. Finally, a multi-scale CNN is constructed using extended convolutions, and the characteristics of each network level are set. A decision fusion method based on maximum output probability computes the probabilities of the different sub-classifiers and determines the final category of an input image, realizing adaptive classification of art images. The experimental results show that the proposed method effectively improves the recall and precision of art image classification and yields reliable classification results.
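The decision fusion step can be illustrated with a minimal sketch. It assumes one plausible reading of "maximum output probability": each sub-classifier emits a probability vector over the same classes, and the fused label is the class holding the single highest probability across all sub-classifiers. The function name `fuse_max_probability` and the input layout are hypothetical, and the abstract does not confirm that this is exactly the authors' rule.

```python
def fuse_max_probability(prob_vectors):
    """Return (class_index, probability) of the largest probability found
    across all sub-classifier outputs.

    prob_vectors: list of per-classifier probability lists, all over the
    same class ordering (an assumed layout).
    """
    best_class, best_prob = None, -1.0
    for probs in prob_vectors:
        for cls, p in enumerate(probs):
            if p > best_prob:
                best_class, best_prob = cls, p
    return best_class, best_prob


# Three hypothetical sub-classifiers (for example, one per scale) voting on three classes.
outputs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.5, 0.4, 0.1],
]
label, confidence = fuse_max_probability(outputs)  # class 1 wins with 0.7
```

Under this rule a single confident sub-classifier can override the others, which differs from averaging or majority voting.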
Affiliation(s)
- Jin Xiang
  - School of Art and Design, Wuhan Polytechnic University, Wuhan, China
- Yi Yang
  - School of Industrial Design, Hubei Institute of Fine Arts, Hubei, Wuhan, China
- Junwei Bai
  - School of Art and Design, Wuhan Polytechnic University, Wuhan, China
3
Wang Y, Liu P, Li D, Wang K, Zhang R. An Image Histogram Equalization Acceleration Method for Field-Programmable Gate Arrays Based on a Two-Dimensional Configurable Pipeline. Sensors (Basel, Switzerland) 2024; 24:280. [PMID: 38203143 PMCID: PMC10781339 DOI: 10.3390/s24010280] [Received: 10/24/2023] [Revised: 12/17/2023] [Accepted: 12/19/2023] [Indexed: 01/12/2024]
Abstract
New artificial intelligence scenarios, such as high-precision online industrial detection and unmanned driving, are constantly emerging and have increased the demand for real-time image processing with high frame rates and low power consumption. Histogram equalization (HE) is a very effective and commonly used image preprocessing algorithm designed to improve the quality of image processing results. However, most existing HE acceleration methods, whether run on general-purpose CPUs or dedicated embedded systems, require further improvement in frame rate to meet the needs of more complex scenarios. In this paper, we propose an HE acceleration method for FPGAs based on a two-dimensional configurable pipeline architecture. We first optimize the parallelizability of HE with a fully configurable two-dimensional pipeline architecture, following the principle of adapting the algorithm to the hardware: one dimension computes the cumulative histogram in parallel and the other processes multiple inputs simultaneously. This optimization also enables a simple architecture, consisting of configurable input units, calculation units, and output units, that achieves a higher frequency when implementing HE on FPGAs. Finally, we optimize the pipeline and critical path of the calculation units. In the experiments, we deploy the optimized HE on a VCU118 test board and achieve a maximum frequency of 891 MHz (up to 22.6 times faster than CPU implementations), as well as a frame rate of 1899 frames per second for 1080p images.
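For readers unfamiliar with HE, the following is a minimal software reference of the textbook algorithm (histogram, cumulative histogram, then a lookup table), written sequentially in plain Python. It is not the authors' FPGA pipeline; the function name `equalize`, the flat pixel list, and the round-half-up integer mapping are assumptions chosen for illustration.

```python
def equalize(pixels, levels=256):
    """Histogram-equalize a flat list of integer gray values in [0, levels)."""
    # Step 1: histogram of gray levels.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Step 2: cumulative histogram (the part the paper parallelizes in hardware).
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    n = len(pixels)
    cdf_min = next((c for c in cdf if c > 0), 0)
    denom = n - cdf_min
    if denom == 0:
        # Empty or single-valued image: nothing to stretch.
        return list(pixels)

    # Step 3: lookup table mapping each level onto the full output range,
    # using integer arithmetic with round-half-up.
    lut = [
        0 if c < cdf_min else ((c - cdf_min) * (levels - 1) + denom // 2) // denom
        for c in cdf
    ]
    return [lut[p] for p in pixels]


# A low-contrast strip of values 100..103 is stretched to the full 0..255 range.
stretched = equalize([100, 100, 101, 101, 102, 102, 103, 103])
# stretched == [0, 0, 85, 85, 170, 170, 255, 255]
```

The cumulative sum in step 2 is inherently sequential in this form, which is why it is a natural target for the parallel hardware dimension described in the abstract.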
Affiliation(s)
- Yan Wang
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
- Peirui Liu
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
- Dalin Li
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
  - School of Computer Science, Zhuhai College of Science and Technology, Zhuhai 519041, China
- Kangping Wang
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
- Rui Zhang
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
4
Yao J, Liu J, Zhang Y, Wang H. Identification of winter wheat pests and diseases based on improved convolutional neural network. Open Life Sci 2023; 18:20220632. [PMID: 37426620 PMCID: PMC10329273 DOI: 10.1515/biol-2022-0632] [Received: 02/23/2023] [Revised: 04/07/2023] [Accepted: 05/17/2023] [Indexed: 07/11/2023]
Abstract
Wheat pests and diseases are among the main factors affecting wheat yield. Based on the characteristics of four common pests and diseases, an identification method using an improved convolutional neural network is proposed. VGGNet16 is selected as the base network model. However, small datasets are common in specific fields such as smart agriculture, which limits the research and application of deep learning methods in the field. Data expansion and transfer learning are therefore introduced to improve the training mode, and an attention mechanism is then introduced for further improvement. The experimental results show that fine-tuning the source model works better than freezing it, and that VGGNet16 with all layers fine-tuned has the best recognition effect, with an accuracy of 96.02%. The CBAM-VGGNet16 and NLCBAM-VGGNet16 models are designed and implemented. Their test-set recognition accuracies are 96.60% and 97.57%, respectively, both higher than that of VGGNet16, achieving high-precision recognition of common pests and diseases of winter wheat.
Affiliation(s)
- Jianbin Yao
  - School of Information Engineering, North China University of Water Resources and Electric Power, Zhengzhou, 450046, China
- Jianhua Liu
  - School of Information Engineering, North China University of Water Resources and Electric Power, Zhengzhou, 450046, China
- Yingna Zhang
  - School of Information Engineering, North China University of Water Resources and Electric Power, Zhengzhou, 450046, China
- Hansheng Wang
  - School of Information Engineering, North China University of Water Resources and Electric Power, Zhengzhou, 450046, China
5
Wang N, Chen T, Liu S, Wang R, Karimi HR, Lin Y. Deep Learning-based Visual Detection of Marine Organisms: A Survey. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2023.02.018] [Indexed: 02/19/2023]
6
Xu S, Zhang M, Song W, Mei H, He Q, Liotta A. A Systematic Review and Analysis of Deep Learning-based Underwater Object Detection. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2023.01.056] [Indexed: 01/12/2023]
7
Abstract
In brain–computer interfaces (BCIs), it is crucial to process brain signals to improve the accuracy of the classification of motor movements. Machine learning (ML) algorithms such as artificial neural networks (ANNs), linear discriminant analysis (LDA), decision tree (DT), K-nearest neighbor (KNN), naive Bayes (NB), and support vector machine (SVM) have made significant progress on classification problems. This paper presents a signal processing analysis of electroencephalographic (EEG) signals, comparing different feature extraction techniques used to train selected classification algorithms to classify signals related to motor movements. The motor movements considered are related to the left hand, right hand, both fists, feet, and relaxation, making this a multiclass problem. In this study, nine ML algorithms were trained with a dataset created by feature extraction from EEG signals. The EEG signals of 30 Physionet subjects were used to create a dataset related to movement. We used electrodes C3, C1, CZ, C2, and C4 according to the standard 10-10 placement. Then, we extracted the epochs of the EEG signals and applied tone, amplitude levels, and statistical techniques to obtain the set of features. Custom LabVIEW™ 2015 applications were used for reading the EEG signals; for channel selection, noise filtering, band selection, and feature extraction; and for creating the dataset. MATLAB 2021a was used for training, testing, and evaluating the performance metrics of the ML algorithms. The Medium-ANN model achieved the best performance, with an average AUC of 0.9998, a Cohen’s Kappa coefficient of 0.9552, a Matthews correlation coefficient of 0.9819, and a loss of 0.0147. These findings suggest the applicability of our approach to different scenarios, such as implementing robotic prostheses, where the use of superficial features is an acceptable option when resources are limited, as in embedded systems or edge computing devices.
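Two of the reported metrics, Cohen's Kappa and the Matthews correlation coefficient (MCC), can both be computed from a multiclass confusion matrix. The sketch below uses the standard definitions (the multiclass MCC follows the usual generalization over row and column totals). The authors computed their metrics in MATLAB, so the function names and the example matrix here are hypothetical and only illustrate the formulas.

```python
import math


def cohen_kappa(cm):
    """Cohen's Kappa from a square confusion matrix (rows: true, columns: predicted)."""
    n = sum(sum(row) for row in cm)
    k = len(cm)
    row_tot = [sum(cm[i]) for i in range(k)]
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    p_observed = sum(cm[i][i] for i in range(k)) / n
    p_expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


def multiclass_mcc(cm):
    """Matthews correlation coefficient generalized to multiple classes."""
    n = sum(sum(row) for row in cm)
    k = len(cm)
    row_tot = [sum(cm[i]) for i in range(k)]                       # true counts per class
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # predicted counts per class
    correct = sum(cm[i][i] for i in range(k))
    numerator = correct * n - sum(r * c for r, c in zip(row_tot, col_tot))
    denominator = math.sqrt(
        (n * n - sum(c * c for c in col_tot)) * (n * n - sum(r * r for r in row_tot))
    )
    return numerator / denominator if denominator else 0.0


# Hypothetical two-class confusion matrix, used only to exercise the formulas.
example = [[2, 1], [1, 2]]
kappa = cohen_kappa(example)   # (4/6 - 1/2) / (1 - 1/2) = 1/3
mcc = multiclass_mcc(example)  # 6 / 18 = 1/3
```

Both metrics correct raw accuracy for agreement expected by chance, which is why they are reported alongside AUC for a five-class problem like the one described.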