1
Anari PY, Lay N, Zahergivar A, Firouzabadi FD, Chaurasia A, Golagha M, Singh S, Homayounieh F, Obiezu F, Harmon S, Turkbey E, Merino M, Jones EC, Ball MW, Linehan WM, Turkbey B, Malayeri AA. Deep learning algorithm (YOLOv7) for automated renal mass detection on contrast-enhanced MRI: a 2D and 2.5D evaluation of results. Abdom Radiol (NY) 2024; 49:1194-1201. [PMID: 38368481] [DOI: 10.1007/s00261-023-04172-w]
Abstract
INTRODUCTION Accurate diagnosis and treatment of kidney tumors greatly benefit from automated solutions for detection and classification on MRI. In this study, we explore the application of a deep learning algorithm, YOLOv7, for detecting kidney tumors on contrast-enhanced MRI. MATERIAL AND METHODS We assessed the performance of YOLOv7 tumor detection on excretory phase MRIs in a large institutional cohort of patients with renal cell carcinoma (RCC). Tumors were segmented on MRI using ITK-SNAP and converted to bounding boxes. The cohort was randomly divided into ten benchmarks for training and testing the YOLOv7 algorithm. The model was evaluated using both a 2-dimensional and a novel, in-house-developed 2.5-dimensional approach. Performance measures included F1, Positive Predictive Value (PPV), sensitivity, the F1 curve, the PPV-sensitivity curve, Intersection over Union (IoU), and mean average PPV (mAP). RESULTS A total of 326 patients with 1034 tumors of 7 different pathologies were analyzed across ten benchmarks. The average 2D evaluation results were as follows: PPV of 0.69 ± 0.05, sensitivity of 0.39 ± 0.02, and F1 score of 0.43 ± 0.03. For the 2.5D evaluation, the average results included a PPV of 0.72 ± 0.06, sensitivity of 0.61 ± 0.06, and F1 score of 0.66 ± 0.04. The best model achieved a 2.5D PPV of 0.75, sensitivity of 0.69, and F1 score of 0.72. CONCLUSION Using computer vision for tumor identification is a cutting-edge and rapidly expanding field. In this work, we showed that YOLOv7 can be utilized in the detection of kidney cancers.
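The mask-to-box step in the methods (tumors segmented in ITK-SNAP, then converted to bounding boxes) reduces to taking the extremal coordinates of the foreground pixels on each slice. A minimal sketch with a hypothetical `mask_to_bbox` helper on a binary mask, shown as an illustration rather than the authors' actual pipeline:

```python
def mask_to_bbox(mask):
    """Convert a binary segmentation mask (list of rows of 0/1)
    to a bounding box (x_min, y_min, x_max, y_max) in pixel coords."""
    ys = [i for i, row in enumerate(mask) if any(row)]
    xs = [j for row in mask for j, v in enumerate(row) if v]
    if not ys:
        return None  # empty mask: no tumor annotated on this slice
    return (min(xs), min(ys), max(xs), max(ys))

mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
print(mask_to_bbox(mask))  # → (1, 1, 3, 2)
```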
Affiliation(s)
- Pouria Yazdian Anari
- Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, 10 Center Drive, 1C352, Bethesda, MD, 20892, USA
- Nathan Lay
- Artificial Intelligence Resource, National Institutes of Health, Bethesda, USA
- Aryan Zahergivar
- Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, 10 Center Drive, 1C352, Bethesda, MD, 20892, USA
- Fatemeh Dehghani Firouzabadi
- Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, 10 Center Drive, 1C352, Bethesda, MD, 20892, USA
- Aditi Chaurasia
- Urology Oncology Branch, National Cancer Institute, National Institutes of Health, Bethesda, USA
- Mahshid Golagha
- Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, 10 Center Drive, 1C352, Bethesda, MD, 20892, USA
- Shiva Singh
- Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, 10 Center Drive, 1C352, Bethesda, MD, 20892, USA
- Fiona Obiezu
- Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, 10 Center Drive, 1C352, Bethesda, MD, 20892, USA
- Stephanie Harmon
- Artificial Intelligence Resource, National Institutes of Health, Bethesda, USA
- Evrim Turkbey
- Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, 10 Center Drive, 1C352, Bethesda, MD, 20892, USA
- Maria Merino
- Pathology Department, National Cancer Institute, National Institutes of Health, Bethesda, USA
- Elizabeth C Jones
- Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, 10 Center Drive, 1C352, Bethesda, MD, 20892, USA
- Mark W Ball
- Urology Oncology Branch, National Cancer Institute, National Institutes of Health, Bethesda, USA
- W Marston Linehan
- Urology Oncology Branch, National Cancer Institute, National Institutes of Health, Bethesda, USA
- Baris Turkbey
- Artificial Intelligence Resource, National Institutes of Health, Bethesda, USA
- Ashkan A Malayeri
- Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, 10 Center Drive, 1C352, Bethesda, MD, 20892, USA
2
Jiang J, Liu H, He L, Pei M, Lin T, Yang H, Yang J, Gong J, Wei X, Zhu M, Wu G, Li Z. HM_ADET: a hybrid model for automatic detection of eyelid tumors based on photographic images. Biomed Eng Online 2024; 23:25. [PMID: 38419078] [PMCID: PMC10903075] [DOI: 10.1186/s12938-024-01221-3]
Abstract
BACKGROUND The accurate detection of eyelid tumors is essential for effective treatment, but it can be challenging due to small and unevenly distributed lesions surrounded by irrelevant noise. Moreover, early symptoms of eyelid tumors are atypical, and some categories of eyelid tumors exhibit similar color and texture features, making it difficult to distinguish between benign and malignant eyelid tumors, particularly for ophthalmologists with limited clinical experience. METHODS We propose a hybrid model, HM_ADET, for automatic detection of eyelid tumors, including YOLOv7_CNFG to locate eyelid tumors and a vision transformer (ViT) to classify benign and malignant eyelid tumors. First, the ConvNeXt module with an inverted bottleneck layer in the backbone of YOLOv7_CNFG is employed to prevent information loss of small eyelid tumors. Then, the flexible rectified linear unit (FReLU) is applied to capture multi-scale features such as texture, edge, and shape, thereby improving the localization accuracy of eyelid tumors. In addition, considering the geometric center and area difference between the predicted box (PB) and the ground truth box (GT), GIoU loss is utilized to handle cases of eyelid tumors with varying shapes and irregular boundaries. Finally, the multi-head attention (MHA) module is applied in ViT to extract discriminative features of eyelid tumors for benign and malignant classification. RESULTS Experimental results demonstrate that the HM_ADET model achieves excellent performance in the detection of eyelid tumors. Specifically, YOLOv7_CNFG outperforms YOLOv7, with AP increasing from 0.763 to 0.893 on the internal test set and from 0.647 to 0.765 on the external test set. ViT achieves AUCs of 0.945 (95% CI 0.894-0.981) and 0.915 (95% CI 0.860-0.955) for the classification of benign and malignant tumors on the internal and external test sets, respectively.
CONCLUSIONS Our study provides a promising strategy for the automatic diagnosis of eyelid tumors, which could potentially improve patient outcomes and reduce healthcare costs.
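GIoU, used above to handle tumors with varying shapes and irregular boundaries, extends IoU with a penalty for the empty area of the smallest enclosing box, so it stays informative even when two boxes do not overlap (the loss is typically 1 − GIoU). A standalone sketch of the GIoU term for axis-aligned boxes, not tied to the paper's implementation:

```python
def giou(a, b):
    """Generalized IoU for boxes (x1, y1, x2, y2).
    GIoU = IoU - |C \\ (A ∪ B)| / |C|, with C the smallest enclosing box."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union
    # Smallest axis-aligned box enclosing both A and B.
    c_area = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return iou - (c_area - union) / c_area

print(giou((0, 0, 2, 2), (1, 1, 3, 3)))  # → ≈ -0.0794 (IoU 1/7, penalty 2/9)
```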
Affiliation(s)
- Jiewei Jiang
- School of Electronic Engineering, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China
- Haiyang Liu
- School of Electronic Engineering, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China
- Lang He
- School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China
- Mengjie Pei
- School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China
- Tongtong Lin
- School of Electronic Engineering, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China
- Hailong Yang
- School of Electronic Engineering, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China
- Junhua Yang
- School of Electronic Engineering, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China
- Jiamin Gong
- School of Modern Post, Xi'an University of Posts and Telecommunications, Xi'an, 710061, China
- Xumeng Wei
- School of Communications and Information Engineering, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China
- Mingmin Zhu
- School of Mathematics and Statistics, Xidian University, Xi'an, 710071, China
- Guohai Wu
- Ningbo Eye Hospital, Wenzhou Medical University, Ningbo, 315000, China
- Zhongwen Li
- Ningbo Eye Hospital, Wenzhou Medical University, Ningbo, 315000, China
- School of Ophthalmology and Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou, 325027, China
3
Gao XR, Wu F, Yuhas PT, Rasel RK, Chiariglione M. Automated vertical cup-to-disc ratio determination from fundus images for glaucoma detection. Sci Rep 2024; 14:4494. [PMID: 38396048] [PMCID: PMC10891153] [DOI: 10.1038/s41598-024-55056-y]
Abstract
Glaucoma is the leading cause of irreversible blindness worldwide. Often asymptomatic for years, this disease can progress significantly before patients become aware of the loss of visual function. Critical examination of the optic nerve through ophthalmoscopy or using fundus images is a crucial component of glaucoma detection before the onset of vision loss. The vertical cup-to-disc ratio (VCDR) is a key structural indicator for glaucoma, as thinning of the superior and inferior neuroretinal rim is a hallmark of the disease. However, manual assessment of fundus images is both time-consuming and subject to variability based on clinician expertise and interpretation. In this study, we develop a robust and accurate automated system employing deep learning (DL) techniques, specifically the YOLOv7 architecture, for the detection of the optic disc and optic cup in fundus images and the subsequent calculation of VCDR. We also address the often-overlooked issue of adapting a DL model, initially trained on a specific population (e.g., European), for VCDR estimation in a different population. Our model was initially trained on ten publicly available datasets and subsequently fine-tuned on the REFUGE dataset, which comprises images collected from Chinese patients. The DL-derived VCDR displayed exceptional accuracy, achieving a Pearson correlation coefficient of 0.91 (P = 4.12 × 10⁻⁴¹²) and a mean absolute error (MAE) of 0.0347 when compared to assessments by human experts. Our models also surpassed existing approaches on the REFUGE dataset, demonstrating higher Dice similarity coefficients and lower MAEs. Moreover, we developed an optimization approach capable of calibrating DL results for new populations. Our novel approaches for detecting optic discs and optic cups and calculating VCDR offer clinicians a promising tool that significantly reduces manual workload in image assessment while improving both speed and accuracy. Most importantly, this automated method effectively differentiates between glaucoma and non-glaucoma cases, making it a valuable asset for glaucoma detection.
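Once the optic disc and optic cup have been detected as bounding boxes, the VCDR itself is simply the ratio of their vertical extents. A minimal sketch with hypothetical pixel coordinates (illustrative only, not the paper's post-processing):

```python
def vcdr(disc_box, cup_box):
    """Vertical cup-to-disc ratio from detected optic-disc and
    optic-cup boxes (x1, y1, x2, y2): ratio of vertical heights."""
    disc_h = disc_box[3] - disc_box[1]
    cup_h = cup_box[3] - cup_box[1]
    return cup_h / disc_h

# Hypothetical detections on a fundus image (pixel coordinates).
disc = (100, 80, 220, 210)   # disc height = 130 px
cup = (130, 120, 190, 185)   # cup height = 65 px
print(round(vcdr(disc, cup), 2))  # → 0.5
```

A VCDR approaching ~0.7 or above is conventionally treated as suspicious for glaucoma, which is why an accurate automated ratio is clinically useful.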
Affiliation(s)
- Xiaoyi Raymond Gao
- Department of Ophthalmology and Visual Sciences, The Ohio State University, Columbus, OH, 43210, USA
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, 43210, USA
- Division of Human Genetics, The Ohio State University, Columbus, OH, 43210, USA
- College of Optometry, The Ohio State University, Columbus, OH, USA
- Fengze Wu
- Department of Ophthalmology and Visual Sciences, The Ohio State University, Columbus, OH, 43210, USA
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, 43210, USA
- Phillip T Yuhas
- College of Optometry, The Ohio State University, Columbus, OH, USA
- Rafiul Karim Rasel
- Department of Ophthalmology and Visual Sciences, The Ohio State University, Columbus, OH, 43210, USA
- Marion Chiariglione
- Department of Ophthalmology and Visual Sciences, The Ohio State University, Columbus, OH, 43210, USA
4
Zhao S, Yuan Y, Wu X, Wang Y, Zhang F. YOLOv7-TS: A Traffic Sign Detection Model Based on Sub-Pixel Convolution and Feature Fusion. Sensors (Basel) 2024; 24:989. [PMID: 38339706] [PMCID: PMC10857214] [DOI: 10.3390/s24030989]
Abstract
In recent years, significant progress has been witnessed in the field of deep learning-based object detection. As a subtask in the field of object detection, traffic sign detection has great potential for development. However, the existing object detection methods for traffic sign detection in real-world scenes are plagued by issues such as the omission of small objects and low detection accuracies. To address these issues, a traffic sign detection model named YOLOv7-Traffic Sign (YOLOv7-TS) is proposed based on sub-pixel convolution and feature fusion. Firstly, the up-sampling capability of the sub-pixel convolution integrating channel dimension is harnessed and a Feature Map Extraction Module (FMEM) is devised to mitigate the channel information loss. Furthermore, a Multi-feature Interactive Fusion Network (MIFNet) is constructed to facilitate enhanced information interaction among all feature layers, improving the feature fusion effectiveness and strengthening the perception ability of small objects. Moreover, a Deep Feature Enhancement Module (DFEM) is established to accelerate the pooling process while enriching the highest-layer feature. YOLOv7-TS is evaluated on two traffic sign datasets, namely CCTSDB2021 and TT100K. Compared with YOLOv7, YOLOv7-TS, with a smaller number of parameters, achieves a significant enhancement of 3.63% and 2.68% in the mean Average Precision (mAP) for each respective dataset, proving the effectiveness of the proposed model.
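Sub-pixel convolution up-samples by rearranging channels into space rather than by interpolating: a (C·r², H, W) feature map becomes (C, rH, rW). A dependency-free sketch of that rearrangement on nested lists, illustrating the operation itself rather than the paper's FMEM module:

```python
def pixel_shuffle(x, r):
    """Sub-pixel up-sampling: rearrange a (C*r^2, H, W) tensor
    (nested lists) into (C, H*r, W*r), as in the PixelShuffle op."""
    cr2, h, w = len(x), len(x[0]), len(x[0][0])
    c = cr2 // (r * r)
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c)]
    for ci in range(c):
        for dy in range(r):
            for dx in range(r):
                # Channel (ci*r*r + dy*r + dx) fills the (dy, dx)
                # offset inside each r x r output block.
                src = x[ci * r * r + dy * r + dx]
                for i in range(h):
                    for j in range(w):
                        out[ci][i * r + dy][j * r + dx] = src[i][j]
    return out

# 4 channels of 1x1 become one 2x2 map.
x = [[[1]], [[2]], [[3]], [[4]]]
print(pixel_shuffle(x, 2))  # → [[[1, 2], [3, 4]]]
```

Because every output pixel comes from a learned channel instead of an interpolation kernel, this up-sampling path preserves channel information, which is the property the FMEM design builds on.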
Affiliation(s)
- Yang Yuan
- School of Software, Henan Polytechnic University, Jiaozuo 454000, China
5
Li Z, Deng Z, Hao K, Zhao X, Jin Z. A Ship Detection Model Based on Dynamic Convolution and an Adaptive Fusion Network for Complex Maritime Conditions. Sensors (Basel) 2024; 24:859. [PMID: 38339576] [PMCID: PMC10856874] [DOI: 10.3390/s24030859]
Abstract
Ship detection is vital for maritime safety and vessel monitoring, but challenges like false and missed detections persist, particularly in complex backgrounds, at multiple scales, and in adverse weather conditions. This paper presents YOLO-Vessel, a ship detection model built upon YOLOv7, which incorporates several innovations to improve its performance. First, we devised a novel backbone network structure called Efficient Layer Aggregation Networks and Omni-Dimensional Dynamic Convolution (ELAN-ODConv). This architecture effectively addresses the complex background interference commonly encountered in maritime ship images, thereby improving the model's feature extraction capabilities. Additionally, we introduced the space-to-depth structure in the head network, which addresses the difficulty of detecting small ship targets in images. Furthermore, we introduced ASFFPredict, a predictive network structure addressing scale variation among ship types, bolstering multiscale ship target detection. Experimental results demonstrate YOLO-Vessel's effectiveness, achieving a 78.3% mean average precision (mAP), surpassing YOLOv7 by 2.3% and Faster R-CNN by 11.6%. It maintains real-time detection at 8.0 ms/frame, meeting real-time ship detection needs. Evaluation in adverse weather conditions confirms YOLO-Vessel's superiority in ship detection, offering a robust solution to maritime challenges and enhancing marine safety and vessel monitoring.
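The space-to-depth structure mentioned for the head network is the inverse rearrangement of sub-pixel up-sampling: each r×r spatial block is folded into channels, so resolution is reduced without discarding any pixel information — which is why it helps with small ship targets. A dependency-free sketch on nested lists (illustrative of the operation, not the paper's exact layer):

```python
def space_to_depth(x, r):
    """Fold each r x r spatial block of a (C, H, W) map (nested lists)
    into channels, giving (C*r^2, H/r, W/r). No pixels are discarded,
    unlike strided convolution or pooling."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    out = []
    for ci in range(c):
        for dy in range(r):
            for dx in range(r):
                out.append([[x[ci][i * r + dy][j * r + dx]
                             for j in range(w // r)]
                            for i in range(h // r)])
    return out

x = [[[1, 2], [3, 4]]]          # one 2x2 channel
print(space_to_depth(x, 2))     # → [[[1]], [[2]], [[3]], [[4]]]
```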
Affiliation(s)
- Zhisheng Li
- School of Computer and Information Engineering, Tianjin Chengjian University, Tianjin 300384, China
- Zhihui Deng
- School of Computer and Information Engineering, Tianjin Chengjian University, Tianjin 300384, China
- Kun Hao
- School of Computer and Information Engineering, Tianjin Chengjian University, Tianjin 300384, China
- Xiaofang Zhao
- School of Computer and Information Engineering, Tianjin Chengjian University, Tianjin 300384, China
- Zhigang Jin
- School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
6
Zhang Z, Lei X, Huang K, Sun Y, Zeng J, Xyu T, Yuan Q, Qi Y, Herbst A, Lyu X. Multi-scenario pear tree inflorescence detection based on improved YOLOv7 object detection algorithm. Front Plant Sci 2024; 14:1330141. [PMID: 38317836] [PMCID: PMC10840500] [DOI: 10.3389/fpls.2023.1330141]
Abstract
Efficient and precise thinning during the orchard blossom period is a crucial factor in enhancing both fruit yield and quality. The accurate recognition of inflorescence is the cornerstone of intelligent blossom-thinning equipment. To advance the process of intelligent blossom thinning, this paper addresses the issue of suboptimal performance of current inflorescence recognition algorithms in detecting dense inflorescence at a long distance. It introduces an inflorescence recognition algorithm, YOLOv7-E, based on the YOLOv7 neural network model. YOLOv7-E incorporates an efficient multi-scale attention mechanism (EMA) to enable cross-channel feature interaction through parallel processing strategies, thereby maximizing the retention of pixel-level features and positional information on the feature maps. Additionally, the SPPCSPC module is optimized to preserve target area features as much as possible under different receptive fields, and the Soft-NMS algorithm is employed to reduce the likelihood of missed detections in overlapping regions. The model is trained on a diverse dataset collected from real-world field settings. Upon validation, the improved YOLOv7-E object detection algorithm achieves an average precision and recall of 91.4% and 89.8%, respectively, in inflorescence detection under various time periods, distances, and weather conditions. The detection time for a single image is 80.9 ms, and the model size is 37.6 MB. In comparison to the original YOLOv7 algorithm, it boasts a 4.9% increase in detection accuracy and a 5.3% improvement in recall rate, with a mere 1.8% increase in model parameters. The YOLOv7-E object detection algorithm presented in this study enables precise inflorescence detection and localization across an entire tree at varying distances, offering robust technical support for differentiated and precise blossom thinning operations by thinning machinery in the future.
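Soft-NMS, employed above against missed detections in overlapping regions, decays the scores of boxes that overlap a higher-scoring detection instead of deleting them outright, so densely packed true positives survive. A minimal Gaussian Soft-NMS sketch (the standard formulation; parameter values are illustrative, not the paper's):

```python
import math

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS over (x1, y1, x2, y2) boxes: decay each
    remaining score by exp(-IoU^2 / sigma) w.r.t. the kept box."""
    def iou(a, b):
        ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
        iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
        inter = ix * iy
        ua = ((a[2] - a[0]) * (a[3] - a[1])
              + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / ua if ua else 0.0

    dets = sorted(zip(boxes, scores), key=lambda d: -d[1])
    keep = []
    while dets:
        box, score = dets.pop(0)          # highest-scoring survivor
        keep.append((box, score))
        dets = [(b, s * math.exp(-iou(box, b) ** 2 / sigma))
                for b, s in dets]
        dets = [(b, s) for b, s in dets if s > score_thresh]
        dets.sort(key=lambda d: -d[1])
    return keep

# Two heavily overlapping flowers plus one distant flower:
# all three survive, but the overlapping box's score is decayed.
result = soft_nms([(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)],
                  [0.9, 0.8, 0.7])
print([round(s, 3) for _, s in result])
```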
Affiliation(s)
- Zhen Zhang
- School of Agricultural Engineering, Jiangsu University, Zhenjiang, China
- Institute of Agricultural Facilities and Equipment, Jiangsu Academy of Agricultural Sciences, Nanjing, China
- Key Laboratory of Modern Horticultural Equipment, Ministry of Agriculture and Rural Affairs, Nanjing, China
- Xiaohui Lei
- Institute of Agricultural Facilities and Equipment, Jiangsu Academy of Agricultural Sciences, Nanjing, China
- Key Laboratory of Modern Horticultural Equipment, Ministry of Agriculture and Rural Affairs, Nanjing, China
- Kai Huang
- Institute of Agricultural Facilities and Equipment, Jiangsu Academy of Agricultural Sciences, Nanjing, China
- Key Laboratory of Modern Horticultural Equipment, Ministry of Agriculture and Rural Affairs, Nanjing, China
- Yuanhao Sun
- Institute of Agricultural Facilities and Equipment, Jiangsu Academy of Agricultural Sciences, Nanjing, China
- Key Laboratory of Modern Horticultural Equipment, Ministry of Agriculture and Rural Affairs, Nanjing, China
- Jin Zeng
- Institute of Agricultural Facilities and Equipment, Jiangsu Academy of Agricultural Sciences, Nanjing, China
- Key Laboratory of Modern Horticultural Equipment, Ministry of Agriculture and Rural Affairs, Nanjing, China
- Tao Xyu
- Institute of Agricultural Facilities and Equipment, Jiangsu Academy of Agricultural Sciences, Nanjing, China
- Key Laboratory of Modern Horticultural Equipment, Ministry of Agriculture and Rural Affairs, Nanjing, China
- Quanchun Yuan
- Institute of Agricultural Facilities and Equipment, Jiangsu Academy of Agricultural Sciences, Nanjing, China
- Key Laboratory of Modern Horticultural Equipment, Ministry of Agriculture and Rural Affairs, Nanjing, China
- Yannan Qi
- Institute of Agricultural Facilities and Equipment, Jiangsu Academy of Agricultural Sciences, Nanjing, China
- Key Laboratory of Modern Horticultural Equipment, Ministry of Agriculture and Rural Affairs, Nanjing, China
- Andreas Herbst
- Institute for Chemical Application Technology of JKI, Braunschweig, Germany
- Xiaolan Lyu
- Institute of Agricultural Facilities and Equipment, Jiangsu Academy of Agricultural Sciences, Nanjing, China
- Key Laboratory of Modern Horticultural Equipment, Ministry of Agriculture and Rural Affairs, Nanjing, China
7
Chen B, Zhang W, Wu W, Li Y, Chen Z, Li C. ID-YOLOv7: an efficient method for insulator defect detection in power distribution network. Front Neurorobot 2024; 17:1331427. [PMID: 38288312] [PMCID: PMC10822988] [DOI: 10.3389/fnbot.2023.1331427]
Abstract
Insulators play a pivotal role in the reliability of power distribution networks, necessitating precise defect detection. However, compared with aerial insulator images of transmission networks, insulator images of power distribution networks contain more complex backgrounds and subtler insulator defects, leading to high false detection and omission rates in current mainstream detection algorithms. In response, this study presents ID-YOLOv7, a tailored convolutional neural network. First, we design a novel Edge Detailed Shape Data Augmentation (EDSDA) method to enhance the model's sensitivity to the insulator's edge shape. Meanwhile, a Cross-Channel and Spatial Multi-Scale Attention (CCSMA) module is proposed, which can interactively model across different channels and spatial domains, to augment the network's attention to high-level insulator defect features. Second, we design a Re-BiC module to fuse multi-scale contextual features and reconstruct the Neck component, alleviating the issue of critical feature loss during inter-feature-layer interaction in traditional FPN structures. Finally, we utilize the MPDIoU function to calculate the model's localization loss, effectively reducing redundant computational costs. We perform comprehensive experiments using the Su22kV_broken and PASCAL VOC 2007 datasets to validate our algorithm's effectiveness. On the Su22kV_broken dataset, our approach attains an 85.7% mAP on a single NVIDIA RTX 2080ti graphics card, marking a 7.2% increase over the original YOLOv7. On the PASCAL VOC 2007 dataset, we achieve an impressive 90.3% mAP at a processing speed of 53 FPS, showing a 2.9% improvement compared to the original YOLOv7.
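MPDIoU scores box agreement as IoU minus the normalized squared distances between the two boxes' corresponding corners; as a loss one typically minimizes 1 − MPDIoU. The sketch below follows one published formulation (normalizing by the squared image dimensions is an assumption here, and this is not necessarily the exact variant used in the paper):

```python
def mpdiou(a, b, img_w, img_h):
    """MPDIoU for boxes (x1, y1, x2, y2): IoU penalized by the
    normalized squared distances between top-left and bottom-right
    corners; img_w/img_h are the input image dimensions."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    iou = inter / union if union else 0.0
    norm = img_w ** 2 + img_h ** 2
    d1 = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2   # top-left corners
    d2 = (a[2] - b[2]) ** 2 + (a[3] - b[3]) ** 2   # bottom-right corners
    return iou - d1 / norm - d2 / norm

# Identical boxes give the maximum value of 1.0.
print(mpdiou((10, 10, 50, 50), (10, 10, 50, 50), 640, 640))  # → 1.0
```

Because both the overlap term and the corner-distance terms are cheap to compute, this is one reason the abstract cites reduced computational cost relative to more elaborate IoU variants.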
Affiliation(s)
- Bojian Chen
- State Grid Fujian Electric Power Research Institute, Fuzhou, China
- Weihao Zhang
- State Grid Fujian Electric Power Research Institute, Fuzhou, China
- Wenbin Wu
- State Grid Fujian Electric Power Research Institute, Fuzhou, China
- Yiran Li
- State Grid Fujian Electric Power Co., Ltd., Fuzhou, China
- Zhuolei Chen
- State Grid Fujian Electric Power Research Institute, Fuzhou, China
- Chenglong Li
- College of Air Traffic Management, Civil Aviation Flight University of China, Guanghan, China
8
Li J, Zhang W, Zhou H, Yu C, Li Q. Weed detection in soybean fields using improved YOLOv7 and evaluating herbicide reduction efficacy. Front Plant Sci 2024; 14:1284338. [PMID: 38273952] [PMCID: PMC10808379] [DOI: 10.3389/fpls.2023.1284338]
Abstract
With increasing environmental awareness and the demand for sustainable agriculture, herbicide reduction has become an important goal. Accurate and efficient weed detection in soybean fields is key to testing the effectiveness of herbicide application, but current technologies and methods still have problems in terms of accuracy and efficiency, such as reliance on manual detection and poor adaptability to some complex environments. Therefore, in this study, weeding experiments in soybean fields with reduced herbicide application, including four levels, were carried out, and an unmanned aerial vehicle (UAV) was utilized to obtain field images. We proposed a weed detection model, YOLOv7-FWeed, based on an improved YOLOv7: we adopted F-ReLU as the activation function of the convolution module and added the MaxPool multihead self-attention (M-MHSA) module to enhance the recognition accuracy of weeds. We continuously monitored changes in soybean leaf area and dry matter weight after herbicide reduction as a reflection of soybean growth at optimal herbicide application levels. The results showed that the herbicide application level of electrostatic spraying + 10% reduction could be used for weeding in soybean fields, and YOLOv7-FWeed was higher than YOLOv7 and YOLOv7-enhanced in all the evaluation indexes. The precision of the model was 0.9496, the recall was 0.9125, the F1 was 0.9307, and the mAP was 0.9662. The results of continuous monitoring of soybean leaf area and dry matter weight showed that herbicide reduction could effectively control weed growth and would not hinder soybean growth. This study can provide a more accurate, efficient, and intelligent solution for weed detection in soybean fields, thus promoting herbicide reduction and providing guidance for exploring efficient herbicide application techniques.
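The reported metrics are internally consistent: F1 is the harmonic mean of precision and recall, and plugging in the reported precision and recall reproduces the reported F1:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Reported YOLOv7-FWeed values: precision 0.9496, recall 0.9125.
print(round(f1_score(0.9496, 0.9125), 4))  # → 0.9307, matching the abstract
```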
Affiliation(s)
- Jinyang Li
- College of Engineering, Heilongjiang Bayi Agricultural University, Daqing, China
- Wei Zhang
- College of Engineering, Heilongjiang Bayi Agricultural University, Daqing, China
- Key Laboratory of Soybean Mechanization Production, Ministry of Agriculture and Rural Affairs, Daqing, China
- Hong Zhou
- College of Engineering, Heilongjiang Bayi Agricultural University, Daqing, China
- Chuntao Yu
- College of Engineering, Heilongjiang Bayi Agricultural University, Daqing, China
- Qingda Li
- College of Engineering, Heilongjiang Bayi Agricultural University, Daqing, China
9
Eida S, Fukuda M, Katayama I, Takagi Y, Sasaki M, Mori H, Kawakami M, Nishino T, Ariji Y, Sumi M. Metastatic Lymph Node Detection on Ultrasound Images Using YOLOv7 in Patients with Head and Neck Squamous Cell Carcinoma. Cancers (Basel) 2024; 16:274. [PMID: 38254765] [PMCID: PMC10813890] [DOI: 10.3390/cancers16020274]
Abstract
Ultrasonography is the preferred modality for detailed evaluation of enlarged lymph nodes (LNs) identified on computed tomography and/or magnetic resonance imaging, owing to its high spatial resolution. However, the diagnostic performance of ultrasonography depends on the examiner's expertise. To support the ultrasonographic diagnosis, we developed YOLOv7-based deep learning models for metastatic LN detection on ultrasonography and compared their detection performance with that of highly experienced radiologists and less experienced residents. We enrolled 462 B- and D-mode ultrasound images of 261 metastatic and 279 non-metastatic histopathologically confirmed LNs from 126 patients with head and neck squamous cell carcinoma. The YOLOv7-based B- and D-mode models were optimized using B- and D-mode training and validation images and their detection performance for metastatic LNs was evaluated using B- and D-mode testing images, respectively. The D-mode model's performance was comparable to that of radiologists and superior to that of residents' reading of D-mode images, whereas the B-mode model's performance was higher than that of residents but lower than that of radiologists on B-mode images. Thus, YOLOv7-based B- and D-mode models can assist less experienced residents in ultrasonographic diagnoses. The D-mode model could raise the diagnostic performance of residents to the same level as experienced radiologists.
Affiliation(s)
- Sato Eida
- Department of Radiology and Biomedical Informatics, Nagasaki University Graduate School of Biomedical Sciences, 1-7-1 Sakamoto, Nagasaki 852-8588, Japan; (S.E.); (I.K.); (Y.T.); (M.S.); (H.M.); (M.K.); (T.N.)
- Motoki Fukuda
- Department of Oral Radiology, Osaka Dental University, 1-5-17 Otemae, Chuo-ku, Osaka 540-0008, Japan; (M.F.); (Y.A.)
- Ikuo Katayama
- Department of Radiology and Biomedical Informatics, Nagasaki University Graduate School of Biomedical Sciences, 1-7-1 Sakamoto, Nagasaki 852-8588, Japan; (S.E.); (I.K.); (Y.T.); (M.S.); (H.M.); (M.K.); (T.N.)
- Yukinori Takagi
- Department of Radiology and Biomedical Informatics, Nagasaki University Graduate School of Biomedical Sciences, 1-7-1 Sakamoto, Nagasaki 852-8588, Japan; (S.E.); (I.K.); (Y.T.); (M.S.); (H.M.); (M.K.); (T.N.)
- Miho Sasaki
- Department of Radiology and Biomedical Informatics, Nagasaki University Graduate School of Biomedical Sciences, 1-7-1 Sakamoto, Nagasaki 852-8588, Japan; (S.E.); (I.K.); (Y.T.); (M.S.); (H.M.); (M.K.); (T.N.)
- Hiroki Mori
- Department of Radiology and Biomedical Informatics, Nagasaki University Graduate School of Biomedical Sciences, 1-7-1 Sakamoto, Nagasaki 852-8588, Japan; (S.E.); (I.K.); (Y.T.); (M.S.); (H.M.); (M.K.); (T.N.)
- Maki Kawakami
- Department of Radiology and Biomedical Informatics, Nagasaki University Graduate School of Biomedical Sciences, 1-7-1 Sakamoto, Nagasaki 852-8588, Japan; (S.E.); (I.K.); (Y.T.); (M.S.); (H.M.); (M.K.); (T.N.)
- Tatsuyoshi Nishino
- Department of Radiology and Biomedical Informatics, Nagasaki University Graduate School of Biomedical Sciences, 1-7-1 Sakamoto, Nagasaki 852-8588, Japan; (S.E.); (I.K.); (Y.T.); (M.S.); (H.M.); (M.K.); (T.N.)
- Yoshiko Ariji
- Department of Oral Radiology, Osaka Dental University, 1-5-17 Otemae, Chuo-ku, Osaka 540-0008, Japan; (M.F.); (Y.A.)
- Misa Sumi
- Department of Radiology and Biomedical Informatics, Nagasaki University Graduate School of Biomedical Sciences, 1-7-1 Sakamoto, Nagasaki 852-8588, Japan; (S.E.); (I.K.); (Y.T.); (M.S.); (H.M.); (M.K.); (T.N.)
10
Jiang Z, Wu B, Ma L, Zhang H, Lian J. APM-YOLOv7 for Small-Target Water-Floating Garbage Detection Based on Multi-Scale Feature Adaptive Weighted Fusion. Sensors (Basel) 2023; 24:50. [PMID: 38202912 PMCID: PMC10780776 DOI: 10.3390/s24010050] [Received: 11/09/2023] [Revised: 12/14/2023] [Accepted: 12/19/2023] [Indexed: 01/12/2024]
Abstract
Because small floating targets carry limited information and appear against complex backgrounds, the accuracy of small-target water-floating garbage detection is low. To increase detection accuracy, this research proposes a small-target detection method based on APM-YOLOv7 (YOLOv7 improved with ACanny, PConv-ELAN, and MGA attention). Firstly, an adaptive river-channel outline extraction algorithm, ACanny (adaptive Canny), is proposed to extract river channel information from the complex background, mitigating background interference and extracting the features of small-target water-floating garbage more accurately. Secondly, lightweight partial convolution (PConv) is introduced, and the partial convolution-efficient layer aggregation network module (PConv-ELAN) is designed in the YOLOv7 network to improve the model's ability to extract features from morphologically variable floating garbage. Finally, after analyzing the limitations of the YOLOv7 network in small-target detection, a multi-scale gated attention for adaptive weight allocation (MGA) is put forward, which highlights the features of small-target garbage and decreases the probability of missed detections. The experimental results showed that, compared with the benchmark YOLOv7, APM-YOLOv7 improved the mean Average Precision (mAP) by 7.02%, mAP0.5:0.95 by 3.91%, and Recall by 11.82%, meeting the requirements of high-precision, real-time water-floating garbage detection and providing a reliable reference for the intelligent management of water-floating garbage.
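The abstract names ACanny, an adaptive variant of the Canny edge detector, for river-channel outline extraction. The paper's exact adaptation rule isn't given here; a common way to make Canny self-tuning is to derive the hysteresis thresholds from the median image intensity. A minimal sketch of that heuristic (NumPy only; the function name and `sigma` value are illustrative assumptions, not the paper's method):

```python
import numpy as np

def auto_canny_thresholds(image: np.ndarray, sigma: float = 0.33):
    """Derive Canny hysteresis thresholds from the median pixel
    intensity, so edge detection adapts to scene brightness.
    (Illustrative heuristic; the paper's ACanny is not specified.)"""
    v = float(np.median(image))
    low = int(max(0.0, (1.0 - sigma) * v))
    high = int(min(255.0, (1.0 + sigma) * v))
    return low, high

# Example: a uniform 8-bit frame with median intensity 120
frame = np.full((4, 4), 120, dtype=np.uint8)
low, high = auto_canny_thresholds(frame)
print(low, high)  # thresholds bracket the median intensity
```

The resulting `(low, high)` pair would then be passed to a Canny implementation such as OpenCV's `cv2.Canny`.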
Affiliation(s)
- Baijing Wu
- School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China; (Z.J.); (L.M.); (H.Z.); (J.L.)
11
Yu J, Zheng H, Xie L, Zhang L, Yu M, Han J. Enhanced YOLOv7 integrated with small target enhancement for rapid detection of objects on water surfaces. Front Neurorobot 2023; 17:1315251. [PMID: 38162894 PMCID: PMC10757635 DOI: 10.3389/fnbot.2023.1315251] [Received: 10/10/2023] [Accepted: 11/23/2023] [Indexed: 01/03/2024]
Abstract
Unmanned surface vessel (USV) target detection algorithms often face challenges such as misdetection and omission of small targets due to significant variations in target scales and susceptibility to interference from complex environments. To address these issues, we propose a small target enhanced YOLOv7 (STE-YOLO) approach. Firstly, we introduce a specialized detection branch designed to identify tiny targets. This enhancement aims to improve the multi-scale target detection capabilities and address difficulties in recognizing targets of different sizes. Secondly, we present the lite visual center (LVC) module, which effectively fuses data from different levels to give more attention to small targets. Additionally, we integrate the lite efficient layer aggregation networks (L-ELAN) into the backbone network to reduce redundant computations and enhance computational efficiency. Lastly, we use Wise-IOU to optimize the loss function definition, thereby improving the model robustness by dynamically optimizing gradient contributions from samples of varying quality. We conducted experiments on the WSODD dataset and the FIOW-Img dataset. The results on the comprehensive WSODD dataset demonstrate that STE-YOLO, when compared to YOLOv7, reduces network parameters by 14% while improving AP50 and APs scores by 2.1% and 1.6%, respectively. Furthermore, when compared to five other leading target detection algorithms, STE-YOLO demonstrates superior accuracy and efficiency.
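Wise-IOU, used above to redefine the loss, belongs to the IoU-loss family, all of which build on the plain intersection-over-union between a predicted and a ground-truth box. A minimal sketch of that base quantity (pure Python; the `(x1, y1, x2, y2)` corner format is an assumption, and this is generic IoU, not the Wise-IoU weighting itself):

```python
def box_iou(a, b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlap rectangle (empty if boxes are disjoint)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7 ≈ 0.1429
```

IoU-based losses such as Wise-IoU then transform this quantity (e.g. weighting it by box-quality terms) so that gradient contributions vary with sample quality.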
Affiliation(s)
- Jie Yu
- Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering, School of Computer and Information, China Three Gorges University, Yichang, China
- School of Computer and Information, China Three Gorges University, Yichang, China
- State Grid Yichang Electric Power Supply Company, Yichang, China
- Hao Zheng
- Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering, School of Computer and Information, China Three Gorges University, Yichang, China
- School of Computer and Information, China Three Gorges University, Yichang, China
- Li Xie
- State Grid Yichang Electric Power Supply Company, Yichang, China
- Lei Zhang
- Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering, School of Computer and Information, China Three Gorges University, Yichang, China
- School of Computer and Information, China Three Gorges University, Yichang, China
- Mei Yu
- Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering, School of Computer and Information, China Three Gorges University, Yichang, China
- School of Computer and Information, China Three Gorges University, Yichang, China
- Jin Han
- State Grid Yichang Electric Power Supply Company, Yichang, China
12
Wang G, Luo G, Lian H, Chen L, Wu W, Liu H. Application of Deep Learning in Clinical Settings for Detecting and Classifying Malaria Parasites in Thin Blood Smears. Open Forum Infect Dis 2023; 10:ofad469. [PMID: 37937045 PMCID: PMC10627339 DOI: 10.1093/ofid/ofad469] [Received: 06/20/2023] [Accepted: 09/13/2023] [Indexed: 11/09/2023]
Abstract
Background Scarcity of annotated image data sets of thin blood smears makes expert-level differentiation among Plasmodium species challenging. Here, we aimed to establish a deep learning algorithm for identifying and classifying malaria parasites in thin blood smears and evaluate its performance and clinical prospect. Methods You Only Look Once v7 was used as the backbone network for training the artificial intelligence algorithm model. The training, validation, and test sets for each malaria parasite category were randomly selected. A comprehensive analysis was performed on 12 708 thin blood smear images of various infective stages of 12 546 malaria parasites, including P falciparum, P vivax, P malariae, P ovale, P knowlesi, and P cynomolgi. Peripheral blood samples were obtained from 380 patients diagnosed with malaria. Additionally, blood samples from monkeys diagnosed with malaria were used to analyze P cynomolgi. The accuracy for detecting Plasmodium-infected blood cells was assessed through various evaluation metrics. Results The total time to identify 1116 malaria parasites was 13 seconds, with an average analysis time of 0.01 seconds for each parasite in the test set. The average precision was 0.902, with a recall and precision of infected erythrocytes of 96.0% and 94.9%, respectively. Sensitivity and specificity exceeded 96.8% and 99.3%, with an area under the receiver operating characteristic curve >0.999. The highest sensitivity (97.8%) and specificity (99.8%) were observed for trophozoites and merozoites. Conclusions The algorithm can help facilitate the clinical and morphologic examination of malaria parasites.
Affiliation(s)
- Geng Wang
- Department of Clinical Laboratory, Peking Union Medical College Hospital, Beijing, China
- Guoju Luo
- Department of Clinical Laboratory, Peking Union Medical College Hospital, Beijing, China
- Heqing Lian
- Beijing Xiaoying Technology Co, Ltd, Beijing, China
- Lei Chen
- Beijing Xiaoying Technology Co, Ltd, Beijing, China
- Wei Wu
- Department of Clinical Laboratory, Peking Union Medical College Hospital, Beijing, China
- Hui Liu
- Central Laboratory, Yunnan Institute of Parasite Diseases, Puer, China
13
Zhang Z, Huang J, Hei G, Wang W. YOLO-IR-Free: An Improved Algorithm for Real-Time Detection of Vehicles in Infrared Images. Sensors (Basel) 2023; 23:8723. [PMID: 37960423 PMCID: PMC10648278 DOI: 10.3390/s23218723] [Received: 08/13/2023] [Revised: 10/17/2023] [Accepted: 10/23/2023] [Indexed: 11/15/2023]
Abstract
In the field of object detection, infrared vehicle detection is an important task. By sensing the thermal radiation emitted by vehicles, infrared sensors enable robust vehicle detection even at night or in adverse weather, enhancing traffic safety and the efficiency of intelligent driving systems. Current infrared vehicle detection techniques struggle with low contrast, small objects, and real-time performance, and some existing lightweight object detection methods have difficulty balancing detection speed and accuracy on this task. To address these issues, this paper presents YOLO-IR-Free, an improved anchor-free algorithm based on YOLOv7 with an enhanced attention mechanism for real-time detection of infrared vehicles. We introduce a new attention mechanism and network module to effectively capture subtle textures and low-contrast features in infrared images, and replace the anchor-based detection head with an anchor-free one to increase detection speed. Experimental results demonstrate that YOLO-IR-Free outperforms other methods in accuracy, recall rate, and average precision while maintaining good real-time performance.
Affiliation(s)
- Zixuan Zhang
- College of Automation, Nanjing University of Information Science & Technology, Nanjing 210044, China
- Jiong Huang
- Business School, The Chinese University of Hong Kong, Hong Kong 999077, China
- Gawen Hei
- School of Physics, Mathematics and Computing, The University of Western Australia, Crawley, WA 6009, Australia
- Wei Wang
- Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (CICAEET), Nanjing University of Information Science & Technology, Nanjing 210044, China
14
Vicente-Martínez JA, Márquez-Olivera M, García-Aliaga A, Hernández-Herrera V. Adaptation of YOLOv7 and YOLOv7_tiny for Soccer-Ball Multi-Detection with DeepSORT for Tracking by Semi-Supervised System. Sensors (Basel) 2023; 23:8693. [PMID: 37960393 PMCID: PMC10650813 DOI: 10.3390/s23218693] [Received: 06/09/2023] [Revised: 10/06/2023] [Accepted: 10/07/2023] [Indexed: 11/15/2023]
Abstract
Object recognition and tracking have long been a challenge, drawing considerable attention from analysts and researchers, particularly in sports, where they play a pivotal role in refining trajectory analysis. This study advances the detection and tracking of soccer balls through a semi-supervised network. Leveraging the YOLOv7 convolutional neural network with a focal loss function, the proposed framework achieves a remarkable 95% accuracy in ball detection, outperforming the methodologies previously reported in the literature. Incorporating focal loss gives the model a distinctive edge, improving ball detection across different fields; this modification, together with the YOLOv7 architecture, yields a marked improvement in accuracy. Building on this result, DeepSORT is implemented to enable precise trajectory tracking. A comparative analysis between versions underscores the superiority of this approach over conventional methods using the default loss function. In the Materials and Methods section, a meticulously curated soccer-ball dataset is assembled, combining images sourced from freely available digital media with images we took at training sessions and amateur matches, for a total of 6331 images, of which 5731 were used for the supervised system and the remaining 600 for the semi-supervised one. This diverse dataset enables comprehensive testing and provides a solid foundation for evaluating the model's performance under varying conditions. Visual results on real-world scenarios confirm the model's proficiency in both detection and classification, further affirming the effectiveness and innovation of the approach. The discussion also covers the hardware specifications employed, highlights encountered errors, and outlines promising avenues for future research.
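The focal loss highlighted in this abstract (and used again in the tea-bud and bone-marrow-cell papers below) down-weights easy examples so training concentrates on hard, misclassified ones. A minimal binary focal-loss sketch (NumPy; `gamma=2.0` and `alpha=0.25` are the defaults from the original focal-loss paper, not necessarily the values used in this study):

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: scales cross-entropy by (1 - p_t)^gamma so
    well-classified samples contribute little to the gradient."""
    p = np.clip(p, 1e-7, 1 - 1e-7)          # numerical safety
    p_t = np.where(y == 1, p, 1 - p)        # prob. of the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return float(np.mean(-alpha_t * (1 - p_t) ** gamma * np.log(p_t)))

# An easy positive (p=0.9) contributes far less than a hard one (p=0.1)
easy = focal_loss(np.array([0.9]), np.array([1]))
hard = focal_loss(np.array([0.1]), np.array([1]))
print(easy < hard)  # True
```

With `gamma=0` and `alpha=0.5` the expression reduces to (half of) ordinary binary cross-entropy, which is why focal loss is described as a drop-in replacement for it.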
Affiliation(s)
- Jorge Armando Vicente-Martínez
- Centro de Investigación e Innovación Tecnológica (CIITEC), Instituto Politécnico Nacional (IPN), Cerrada Cecati s/n Col. Sta. Catarina, Azcapotzalco, Mexico City 02250, Mexico;
- Moisés Márquez-Olivera
- Centro de Investigación e Innovación Tecnológica (CIITEC), Instituto Politécnico Nacional (IPN), Cerrada Cecati s/n Col. Sta. Catarina, Azcapotzalco, Mexico City 02250, Mexico;
- Abraham García-Aliaga
- Departamento de Deportes, Facultad de Ciencias, de la Actividad Física y del Deporte, INEF, Universidad Politécnica de Madrid, Calle Martín Fierro, 7, 28040 Madrid, Spain;
- Viridiana Hernández-Herrera
- Centro de Investigación e Innovación Tecnológica (CIITEC), Instituto Politécnico Nacional (IPN), Cerrada Cecati s/n Col. Sta. Catarina, Azcapotzalco, Mexico City 02250, Mexico;
15
Jia K, Niu Q, Wang L, Niu Y, Ma W. A New Efficient Multi-Object Detection and Size Calculation for Blended Tobacco Shreds Using an Improved YOLOv7 Network and LWC Algorithm. Sensors (Basel) 2023; 23:8380. [PMID: 37896474 PMCID: PMC10610831 DOI: 10.3390/s23208380] [Received: 09/05/2023] [Revised: 09/30/2023] [Accepted: 10/06/2023] [Indexed: 10/29/2023]
Abstract
Detection of the four tobacco shred varieties and the subsequent unbroken tobacco shred rate are the primary tasks in cigarette inspection lines. It is especially critical to identify both single and overlapped tobacco shreds at one time, that is, fast blended tobacco shred detection based on multiple targets. However, it is difficult to classify tiny single tobacco shreds with complex morphological characteristics, not to mention classifying tobacco shreds with 24 types of overlap, posing significant difficulties for machine vision-based blended tobacco shred multi-object detection and unbroken tobacco shred rate calculation tasks. This study focuses on the two challenges of identifying blended tobacco shreds and calculating the unbroken tobacco shred rate. In this paper, a new multi-object detection model is developed for blended tobacco shred images based on an improved YOLOv7-tiny model. YOLOv7-tiny is used as the multi-object detection network's mainframe. A lightweight Resnet19 is used as the model backbone. The original SPPCSPC and coupled detection head are replaced with a new spatial pyramid SPPFCSPC and a decoupled joint detection head, respectively. An algorithm for two-dimensional size calculation of blended tobacco shreds (LWC) is also proposed, which is applied to blended tobacco shred object detection images to obtain independent tobacco shred objects and calculate the unbroken tobacco shred rate. The experimental results showed that the final detection precision, mAP@.5, mAP@.5:.95, and testing time were 0.883, 0.932, 0.795, and 4.12 ms, respectively. The average length and width detection accuracy of the blended tobacco shred samples were -1.7% and 13.2%, respectively. The model achieved high multi-object detection accuracy and 2D size calculation accuracy, which also conformed to the manual inspection process in the field. 
This study provides a new efficient implementation method for multi-object detection and size calculation of blended tobacco shreds in cigarette quality inspection lines and a new approach for other similar blended image multi-object detection tasks.
Affiliation(s)
- Li Wang
- College of Electrical Engineering, Henan University of Technology, Zhengzhou 450000, China; (K.J.); (Q.N.); (Y.N.); (W.M.)
16
Zhang F, Sun H, Xie S, Dong C, Li Y, Xu Y, Zhang Z, Chen F. A tea bud segmentation, detection and picking point localization based on the MDY7-3PTB model. Front Plant Sci 2023; 14:1199473. [PMID: 37841621 PMCID: PMC10570925 DOI: 10.3389/fpls.2023.1199473] [Received: 04/03/2023] [Accepted: 09/04/2023] [Indexed: 10/17/2023]
Abstract
Introduction The identification and localization of tea picking points is a prerequisite for achieving automatic picking of famous tea. However, due to the similarity in color between tea buds and young leaves and old leaves, it is difficult for the human eye to accurately identify them. Methods To address the problem of segmentation, detection, and localization of tea picking points in the complex environment of mechanical picking of famous tea, this paper proposes a new model called the MDY7-3PTB model, which combines the high-precision segmentation capability of DeepLabv3+ and the rapid detection capability of YOLOv7. This model achieves the process of segmentation first, followed by detection and finally localization of tea buds, resulting in accurate identification of the tea bud picking point. This model replaced the DeepLabv3+ feature extraction network with the more lightweight MobileNetV2 network to improve the model computation speed. In addition, multiple attention mechanisms (CBAM) were fused into the feature extraction and ASPP modules to further optimize model performance. Moreover, to address the problem of class imbalance in the dataset, the Focal Loss function was used to correct data imbalance and improve segmentation, detection, and positioning accuracy. Results and discussion The MDY7-3PTB model achieved a mean intersection over union (mIoU) of 86.61%, a mean pixel accuracy (mPA) of 93.01%, and a mean recall (mRecall) of 91.78% on the tea bud segmentation dataset, which performed better than usual segmentation models such as PSPNet, Unet, and DeeplabV3+. In terms of tea bud picking point recognition and positioning, the model achieved a mean average precision (mAP) of 93.52%, a weighted average of precision and recall (F1 score) of 93.17%, a precision of 97.27%, and a recall of 89.41%. This model showed significant improvements in all aspects compared to existing mainstream YOLO series detection models, with strong versatility and robustness. 
This method eliminates the influence of the background and directly detects the tea bud picking points with almost no missed detections, providing accurate two-dimensional coordinates for the tea bud picking points, with a positioning precision of 96.41%. This provides a strong theoretical basis for future tea bud picking.
Affiliation(s)
- Fenyun Zhang
- School of Automation, Hangzhou Dianzi University, Hangzhou, China
- Hongwei Sun
- School of Automation, Hangzhou Dianzi University, Hangzhou, China
- Shuang Xie
- School of Automation, Hangzhou Dianzi University, Hangzhou, China
- Chunwang Dong
- Tea Research Institute, Shandong Academy of Agricultural Sciences, Jinan, China
- You Li
- School of Automation, Hangzhou Dianzi University, Hangzhou, China
- Yiting Xu
- School of Automation, Hangzhou Dianzi University, Hangzhou, China
- Zhengwei Zhang
- School of Automation, Hangzhou Dianzi University, Hangzhou, China
- Fengnong Chen
- School of Automation, Hangzhou Dianzi University, Hangzhou, China
17
Cao W, Chen Z, Deng X, Wu C, Li T. An Identification Method for Irregular Components Related to Terminal Blocks in Equipment Cabinet of Power Substation. Sensors (Basel) 2023; 23:7739. [PMID: 37765796 PMCID: PMC10535969 DOI: 10.3390/s23187739] [Received: 07/11/2023] [Revised: 08/23/2023] [Accepted: 08/29/2023] [Indexed: 09/29/2023]
Abstract
Despite the continuous advancement of intelligent power substations, inspecting the terminal block components inside equipment cabinets still requires substantial manpower. The repetitive documentation work is not only inefficient but also prone to inaccuracies introduced by substation personnel. To shorten these time-consuming inspections, this paper presents a terminal block component detection and identification method: a multi-stage system incorporating a streamlined version of You Only Look Once version 7 (YOLOv7), a fusion of YOLOv7 with differential binarization (DB), and PaddleOCR. Firstly, the YOLOv7 Area-Oriented (YOLOv7-AO) model is developed to precisely locate the complete terminal block region within substation scene images; this compact area-extraction model rapidly crops the valid proportion of the input image. Furthermore, a DB segmentation head is integrated into the YOLOv7 model to handle the densely arranged, irregularly shaped block components. To detect all components within a target electrical cabinet, the YOLOv7 model with a differential binarization attention head (YOLOv7-DBAH) is proposed, integrating spatial and channel attention mechanisms. Finally, a general OCR algorithm is applied to the cropped-out instances, after image de-distortion, to match and record each component's identity information. Experimental results show that the YOLOv7-AO model reaches high detection accuracy with good portability while running 4.45 times faster. Moreover, on terminal block component detection, the YOLOv7-DBAH model achieves the highest evaluation metrics, increasing the F1-score from 0.83 to 0.89 and boosting precision to over 0.91. The proposed method achieves terminal block component identification and can be applied in practical settings.
Affiliation(s)
- Weiguo Cao
- School of Electrical Engineering, Southeast University, Nanjing 210096, China;
- Zhong Chen
- School of Electrical Engineering, Southeast University, Nanjing 210096, China;
- Xuhui Deng
- Fuzhou Power Supply Branch, State Grid Fujian Power Company, Fuzhou 350001, China;
- Congying Wu
- State Grid Economic and Technological Research Institute Co., Ltd., Beijing 100005, China;
- Tiecheng Li
- Power Science and Research Institute of State Grid Hebei Power Co., Beijing 430024, China;
18
Cheng Z, Li Y. Improved YOLOv7 Algorithm for Detecting Bone Marrow Cells. Sensors (Basel) 2023; 23:7640. [PMID: 37688095 PMCID: PMC10490824 DOI: 10.3390/s23177640] [Received: 07/31/2023] [Revised: 08/29/2023] [Accepted: 08/31/2023] [Indexed: 09/10/2023]
Abstract
The detection and classification of bone marrow (BM) cells is a critical cornerstone of hematology diagnosis. However, because accuracy is limited by scarce BM-cell data samples, subtle differences between classes, and small target sizes, pathologists still need to perform thousands of manual identifications daily. To address these issues, we propose an improved BM-cell-detection algorithm in this paper, called YOLOv7-CTA. Firstly, to enhance the model's sensitivity to fine-grained features, we design a new module called CoTLAN in the backbone network to enable long-range modeling of target feature information. Then, to help the CoTLAN modules attend to the region to be detected, we integrate the coordinate attention (CoordAtt) module between the CoTLAN modules, improving the model's attention to small-target features. Finally, we cluster the target boxes of the BM cell dataset with K-means++ to generate more suitable anchor boxes, which accelerates the convergence of the improved model. In addition, to address the imbalance between positive and negative samples in BM-cell images, we replace the multi-class cross entropy with the Focal loss function. Experimental results demonstrate that the best mean average precision (mAP) of the proposed model reaches 88.6%, an improvement of 12.9%, 8.3%, and 6.7% over the Faster R-CNN, YOLOv5l, and YOLOv7 models, respectively. This verifies the effectiveness and superiority of the YOLOv7-CTA model in BM-cell-detection tasks.
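The anchor-clustering step mentioned in this abstract is standard in anchor-based YOLO pipelines: ground-truth box sizes are clustered, and the cluster centers become the anchor boxes. A toy sketch of the idea (NumPy; plain K-means with Euclidean distance and random initialization for brevity, whereas this paper uses K-means++ seeding and practical YOLO implementations often use a 1 − IoU distance):

```python
import numpy as np

def kmeans_anchors(wh, k, iters=50, seed=0):
    """Cluster (width, height) pairs; the centers serve as anchor boxes."""
    rng = np.random.default_rng(seed)
    centers = wh[rng.choice(len(wh), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # Assign each box to its nearest center (Euclidean for simplicity)
        d = np.linalg.norm(wh[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Move each center to the mean of its assigned boxes
        for j in range(k):
            if np.any(labels == j):
                centers[j] = wh[labels == j].mean(axis=0)
    return centers[centers[:, 0].argsort()]  # sort by width for stable output

# Two obvious size groups -> two anchors near (10, 12) and (50, 60)
wh = np.array([[10, 12], [11, 13], [9, 11], [50, 60], [52, 58], [48, 62]])
anchors = kmeans_anchors(wh, k=2)
print(anchors)
```

Anchors matched to the dataset's box-size distribution give the detector better starting shapes, which is why the paper reports faster convergence after this step.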
Affiliation(s)
- Yuanyuan Li
- School of Mathematics and Physics, Wuhan Institute of Technology, Wuhan 430205, China
19
Chai JJK, Xu JL, O’Sullivan C. Real-Time Detection of Strawberry Ripeness Using Augmented Reality and Deep Learning. Sensors (Basel) 2023; 23:7639. [PMID: 37688097 PMCID: PMC10490577 DOI: 10.3390/s23177639] [Received: 08/16/2023] [Revised: 08/31/2023] [Accepted: 09/01/2023] [Indexed: 09/10/2023]
Abstract
Currently, strawberry harvesting relies heavily on human labour and subjective assessments of ripeness, resulting in inconsistent post-harvest quality. Therefore, the aim of this work is to automate this process and provide a more accurate and efficient way of assessing ripeness. We explored a unique combination of YOLOv7 object detection and augmented reality technology to detect and visualise the ripeness of strawberries. Our results showed that the proposed YOLOv7 object detection model, which employed transfer learning, fine-tuning and multi-scale training, accurately identified the level of ripeness of each strawberry with an mAP of 0.89 and an F1 score of 0.92. The tiny models have an average detection time of 18 ms per frame at a resolution of 1280 × 720 using a high-performance computer, thereby enabling real-time detection in the field. Our findings distinctly establish the superior performance of YOLOv7 when compared to other cutting-edge methodologies. We also suggest using Microsoft HoloLens 2 to overlay predicted ripeness labels onto each strawberry in the real world, providing a visual representation of the ripeness level. Despite some challenges, this work highlights the potential of augmented reality to assist farmers in harvesting support, which could have significant implications for current agricultural practices.
Affiliation(s)
- Jackey J. K. Chai
- School of Computer Science and Statistics, Trinity College Dublin, D02 PN40 Dublin, Ireland; (J.J.K.C.)
- Jun-Li Xu
- School of Biosystems and Food Engineering, University College Dublin, D04 V1W8 Dublin, Ireland
- Carol O’Sullivan
- School of Computer Science and Statistics, Trinity College Dublin, D02 PN40 Dublin, Ireland; (J.J.K.C.)
20
Li Y, Xu S, Zhu Z, Wang P, Li K, He Q, Zheng Q. EFC-YOLO: An Efficient Surface-Defect-Detection Algorithm for Steel Strips. Sensors (Basel) 2023; 23:7619. [PMID: 37688077 PMCID: PMC10490735 DOI: 10.3390/s23177619] [Received: 08/03/2023] [Revised: 08/28/2023] [Accepted: 08/31/2023] [Indexed: 09/10/2023]
Abstract
The pursuit of higher recognition accuracy and speed with smaller model sizes has been a major research topic in the detection of surface defects in steel. In this paper, we propose an improved high-speed and high-precision Efficient Fusion Coordination network (EFC-YOLO) without increasing the model's size. Since modifications to enhance feature extraction in shallow networks tend to affect the speed of model inference, in order to simultaneously ensure the accuracy and speed of detection, we add the improved Fusion-Faster module to the backbone network of YOLOv7. Partial Convolution (PConv) serves as the basic operator of the module, which strengthens the feature-extraction ability of shallow networks while maintaining speed. Additionally, we incorporate the Shortcut Coordinate Attention (SCA) mechanism to better capture the location information dependency, considering both lightweight design and accuracy. The de-weighted Bi-directional Feature Pyramid Network (BiFPN) structure used in the neck part of the network improves the original Path Aggregation Network (PANet)-like structure by adding step branches and reducing computations, achieving better feature fusion. In the experiments conducted on the NEU-DET dataset, the final model achieved an 85.9% mAP and decreased the GFLOPs by 60%, effectively balancing the model's size with the accuracy and speed of detection.
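The weighted BiFPN referred to above is usually implemented with EfficientDet-style "fast normalized fusion": each input feature map receives a non-negative scalar weight, and the weighted sum is normalized by the weight total. A minimal sketch under that assumption (NumPy; weights are fixed here rather than learned, and this shows the generic fusion rule, not necessarily EFC-YOLO's exact variant):

```python
import numpy as np

def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shape feature maps as sum(w_i * f_i) / (sum(w_i) + eps),
    with weights clamped to be non-negative (ReLU), as in BiFPN."""
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)
    num = sum(wi * f for wi, f in zip(w, features))
    return num / (w.sum() + eps)

# Equal weights reduce to a (near-exact) average of the inputs
f1 = np.ones((2, 2))
f2 = np.full((2, 2), 3.0)
fused = fast_normalized_fusion([f1, f2], weights=[1.0, 1.0])
print(fused)  # every entry close to (1 + 3) / 2 = 2
```

Compared with softmax-normalized attention over inputs, this formulation avoids the exponential and is the cheaper choice BiFPN adopts for speed.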
Affiliation(s)
- Shuobo Xu
- School of Information Science and Electrical Engineering, Shandong Jiaotong University, Jinan 250357, China; (Y.L.); (Z.Z.); (P.W.); (K.L.); (Q.H.); (Q.Z.)
21
Wen C, Guo H, Li J, Hou B, Huang Y, Li K, Nong H, Long X, Lu Y. Application of improved YOLOv7-based sugarcane stem node recognition algorithm in complex environments. Front Plant Sci 2023; 14:1230517. [PMID: 37680364 PMCID: PMC10481968 DOI: 10.3389/fpls.2023.1230517] [Received: 05/29/2023] [Accepted: 07/31/2023] [Indexed: 09/09/2023]
Abstract
Introduction Sugarcane stem node detection is one of the key functions of a small intelligent sugarcane harvesting robot, but detection accuracy degrades severely in complex field environments when the sugarcane is shadowed by confusing backgrounds and other objects. Methods To address the low accuracy of sugarcane stem node detection in complex environments, this paper proposes an improved sugarcane stem node detection model based on YOLOv7. First, the SimAM (A Simple, Parameter-Free Attention Module for Convolutional Neural Networks) attention mechanism is added to counteract the feature loss caused by the loss of global image context during convolution, improving detection accuracy on blurred images. Second, Deformable Convolution Networks replace some of the traditional convolution layers in the original YOLOv7. Finally, a new bounding box regression loss function, WIoU Loss, is introduced to address unbalanced sample quality, improve the model's robustness and generalization ability, and accelerate network convergence. Results The experimental results show that the improved model reaches an mAP of 94.53% and an F1 value of 92.41, improvements of 3.43% and 2.21 points, respectively, over the YOLOv7 model; compared with the SOTA method's mAP of 94.1%, it achieves a 0.43% improvement, effectively improving the detection performance of the target detection model. Discussion This study provides a theoretical basis and technical support for the development of a small intelligent sugarcane harvesting robot, and may also serve as a reference for detecting other crop types in similar environments.
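The SimAM module referenced above is parameter-free: each activation is weighted by a closed-form inverse-energy score. A minimal NumPy sketch of the weighting rule (λ = 1e-4 follows the SimAM paper's default; this is an illustration, not the authors' code):

```python
import numpy as np

def simam(x, lam=1e-4):
    """Parameter-free SimAM attention on a (C, H, W) tensor: per channel,
    activations far from the channel mean get higher inverse-energy
    scores, and each value is scaled by sigmoid(score)."""
    n = x.shape[1] * x.shape[2] - 1
    mu = x.mean(axis=(1, 2), keepdims=True)
    d = (x - mu) ** 2
    var = d.sum(axis=(1, 2), keepdims=True) / n
    score = d / (4.0 * (var + lam)) + 0.5
    return x * (1.0 / (1.0 + np.exp(-score)))   # x * sigmoid(score)

x = np.random.rand(3, 8, 8)
y = simam(x)
print(y.shape)  # (3, 8, 8): same shape, activations re-weighted in place
```

Because the weights come from a closed-form energy function, SimAM adds attention without adding any learnable parameters, which suits the small-robot deployment target described above.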
Affiliation(s)
- Chunming Wen
- College of Electronic Information, Guangxi Minzu University, Nanning, China
- Guangxi Key Laboratory of Intelligent Unmanned System and Intelligent Equipment, Nanning, Guangxi, China
- Guangxi Key Laboratory of Hybrid Computation and IC Design Analysis, Nanning, Guangxi, China
- Huanyu Guo
- College of Electronic Information, Guangxi Minzu University, Nanning, China
- Jianheng Li
- College of Electronic Information, Guangxi Minzu University, Nanning, China
- Bingxu Hou
- College of Electronic Information, Guangxi Minzu University, Nanning, China
- Youzong Huang
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Nanning, Guangxi, China
- Kaihua Li
- College of Electronic Information, Guangxi Minzu University, Nanning, China
- Guangxi Key Laboratory of Intelligent Unmanned System and Intelligent Equipment, Nanning, Guangxi, China
- Hongliang Nong
- Technology Development Center, Guangxi Agricultural Machinery Research Institute, Nanning, China
- Xiaozhu Long
- Department of Technical Research and Development, Nanning Titanium Silver Technology Co., Nanning, China
- Yuchun Lu
- College of Electronic Information, Guangxi Minzu University, Nanning, China
22
Abdusalomov AB, Mukhiddinov M, Whangbo TK. Brain Tumor Detection Based on Deep Learning Approaches and Magnetic Resonance Imaging. Cancers (Basel) 2023; 15:4172. [PMID: 37627200 PMCID: PMC10453020 DOI: 10.3390/cancers15164172]
Abstract
The rapid development of abnormal brain cells that characterizes a brain tumor is a major health risk for adults, since it can cause severe impairment of organ function and even death. These tumors come in a wide variety of sizes, textures, and locations. Magnetic resonance imaging (MRI) is a crucial tool for locating cancerous tumors, but detecting brain tumors manually is difficult and time-consuming and can lead to inaccuracies. To address this, we present a refined You Only Look Once version 7 (YOLOv7) model for the accurate detection of meningioma, glioma, and pituitary gland tumors within an improved brain tumor detection system. The visual representation of the MRI scans is enhanced by image enhancement methods that apply different filters to the original pictures, and data augmentation techniques applied to the openly accessible brain tumor dataset further improve the training of the proposed model. The curated data include a wide variety of cases: 2548 glioma images, 2658 pituitary tumor images, 2582 meningioma images, and 2500 non-tumor images. We integrated the Convolutional Block Attention Module (CBAM) into YOLOv7 to enhance its feature extraction capabilities, allowing better emphasis on salient regions linked with brain malignancies, and added a Spatial Pyramid Pooling Fast+ (SPPF+) layer to the network's core infrastructure to improve the model's sensitivity. YOLOv7 now includes decoupled heads, which allow it to efficiently glean useful insights from a wide variety of data, and a Bi-directional Feature Pyramid Network (BiFPN) speeds up multi-scale feature fusion and better collects tumor-associated features. The outcomes verify the efficiency of our suggested method, which achieves higher overall tumor-detection accuracy than previous state-of-the-art models. As a result, this framework has strong potential as a decision-support tool for experts diagnosing brain tumors.
Affiliation(s)
- Taeg Keun Whangbo
- Department of Computer Engineering, Gachon University, Seongnam-si 13120, Republic of Korea
23
Wang S, Wu D, Zheng X. TBC-YOLOv7: a refined YOLOv7-based algorithm for tea bud grading detection. Front Plant Sci 2023; 14:1223410. [PMID: 37662161 PMCID: PMC10469839 DOI: 10.3389/fpls.2023.1223410]
Abstract
Introduction Accurate grading identification of tea buds is a prerequisite for automated tea-picking based on a machine vision system. However, current target detection algorithms face challenges in detecting tea bud grades in complex backgrounds. In this paper, an improved YOLOv7 tea bud grading detection algorithm, TBC-YOLOv7, is proposed. Methods The TBC-YOLOv7 algorithm borrows the transformer architecture from natural language processing, integrating a transformer module based on the contextual information in the feature map into YOLOv7, thereby facilitating self-attention learning and strengthening the connection of global feature information. To fuse feature information at different scales, the algorithm employs a bidirectional feature pyramid network. In addition, coordinate attention is embedded at critical positions in the network to suppress useless background details while paying more attention to the prominent features of tea buds. The SIOU loss function is applied as the bounding box loss to improve the convergence speed of the network. Result The experiments indicate that TBC-YOLOv7 is effective on all sample grades in the test set. Specifically, the model achieves precisions of 88.2% and 86.9%, with corresponding recalls of 81% and 75.9%. Its mean average precision reaches 87.5%, 3.4% higher than the original YOLOv7, with average precision values of up to 90% for one bud with one leaf, and the F1 score reaches 0.83. The model also outperforms YOLOv7 in terms of parameter count. Finally, the model's detections correlate strongly with the manual annotation results (R² = 0.89), with a root mean square error of 1.54. Discussion The TBC-YOLOv7 model exhibits superior performance in visual recognition, indicating that the improved YOLOv7 fused with a transformer-style module can achieve higher grading accuracy on densely growing tea buds, thereby enabling grade detection of tea buds in practical scenarios and providing a solution and technical support for automated tea bud collection and grading.
Affiliation(s)
- Siyang Wang
- College of Mathematics and Computer Science, Zhejiang A&F University, Hangzhou, China
- Key Laboratory of State Forestry and Grassland Administration on Forestry Sensing Technology and Intelligent Equipment, Hangzhou, China
- Key Laboratory of Forestry Intelligent Monitoring and Information Technology of Zhejiang, Hangzhou, China
- Dasheng Wu
- College of Mathematics and Computer Science, Zhejiang A&F University, Hangzhou, China
- Key Laboratory of State Forestry and Grassland Administration on Forestry Sensing Technology and Intelligent Equipment, Hangzhou, China
- Key Laboratory of Forestry Intelligent Monitoring and Information Technology of Zhejiang, Hangzhou, China
- Xinyu Zheng
- College of Mathematics and Computer Science, Zhejiang A&F University, Hangzhou, China
- Key Laboratory of State Forestry and Grassland Administration on Forestry Sensing Technology and Intelligent Equipment, Hangzhou, China
- Key Laboratory of Forestry Intelligent Monitoring and Information Technology of Zhejiang, Hangzhou, China
24
Li S, Wang S, Wang P. A Small Object Detection Algorithm for Traffic Signs Based on Improved YOLOv7. Sensors (Basel) 2023; 23:7145. [PMID: 37631682 PMCID: PMC10459082 DOI: 10.3390/s23167145]
Abstract
Traffic sign detection is a crucial task in computer vision, with wide-ranging applications in intelligent transportation systems, autonomous driving, and traffic safety. However, due to the complexity and variability of traffic environments and the small size of traffic signs, detecting small traffic signs in real-world scenes remains a challenging problem. To improve the recognition of road traffic signs, this paper proposes a small object detection algorithm for traffic signs based on an improved YOLOv7. First, a small target detection layer was added in the neck region to augment the detection capability for small traffic sign targets. Simultaneously, self-attention and convolutional mix modules (ACmix) were integrated into the newly added small target detection layer, enabling the capture of additional feature information through ACmix's convolutional and self-attention channels. Furthermore, the feature extraction capability of the convolution modules was enhanced by replacing the regular convolution modules in the neck layer with omni-dimensional dynamic convolution (ODConv). To further enhance the accuracy of small target detection, the normalized Gaussian Wasserstein distance (NWD) metric was introduced to mitigate the sensitivity to minor positional deviations of small objects. Experimental results on the challenging public dataset TT100K demonstrate that the SANO-YOLOv7 algorithm achieved an 88.7% mAP@0.5, outperforming the baseline YOLOv7 model by 5.3%.
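The NWD metric replaces IoU's hard overlap test with a smooth similarity between boxes modeled as 2-D Gaussians, so a few pixels of offset on a tiny box no longer zeroes the score. A sketch assuming (cx, cy, w, h) boxes and an illustrative constant `c` (in the NWD paper, `c` is set from the dataset's average target size; the value below is arbitrary):

```python
import numpy as np

def nwd(box_a, box_b, c=12.8):
    """Normalized Gaussian Wasserstein distance between two boxes given
    as (cx, cy, w, h). Identical boxes score 1.0; the score decays
    smoothly with positional offset instead of dropping to 0 like IoU
    does for non-overlapping small boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    w2 = np.sqrt((ax - bx) ** 2 + (ay - by) ** 2
                 + ((aw - bw) / 2.0) ** 2 + ((ah - bh) / 2.0) ** 2)
    return float(np.exp(-w2 / c))

same = nwd((50, 50, 10, 10), (50, 50, 10, 10))
near = nwd((50, 50, 10, 10), (54, 50, 10, 10))  # 4 px offset
print(same, near)  # 1.0, then a value below 1.0 but well above 0
```

This smooth decay is what reduces the sensitivity to minor positional deviations mentioned in the abstract.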
Affiliation(s)
- Songjiang Li
- College of Computer Science and Technology, Changchun University of Science and Technology, Changchun 130022, China
- Shilong Wang
- College of Computer Science and Technology, Changchun University of Science and Technology, Changchun 130022, China
- Peng Wang
- College of Computer Science and Technology, Changchun University of Science and Technology, Changchun 130022, China
- Chongqing Research Institute, Changchun University of Science and Technology, Chongqing 401120, China
25
Huang P, Wang S, Chen J, Li W, Peng X. Lightweight Model for Pavement Defect Detection Based on Improved YOLOv7. Sensors (Basel) 2023; 23:7112. [PMID: 37631649 PMCID: PMC10459580 DOI: 10.3390/s23167112]
Abstract
Existing pavement defect detection models face challenges in balancing detection accuracy and speed while being constrained by large parameter sizes, hindering deployment on edge terminal devices with limited computing resources. To address these issues, this paper proposes a lightweight pavement defect detection model based on an improved YOLOv7 architecture. The model introduces four key enhancements: first, the incorporation of the SPPCSPC_Group grouped spatial pyramid pooling module to reduce the parameter load and computational complexity; second, the use of the K-means clustering algorithm for generating anchors, accelerating model convergence; third, the integration of the Ghost Conv module, enhancing feature extraction while minimizing parameters and calculations; fourth, the introduction of the CBAM attention module to enrich the semantic information in the last layer of the backbone network. The experimental results demonstrate that the improved model achieved an average accuracy of 91%, with the accuracy in detecting broken plates and repairs increasing by 9% and 8%, respectively, compared to the original model. Moreover, the improved model exhibited reductions of 14.4% and 29.3% in calculations and parameters, respectively, and a 29.1% decrease in model size, while reaching an impressive 80 FPS (frames per second). The enhanced YOLOv7 successfully balances parameter reduction and computation while maintaining high accuracy, making it a more suitable choice for pavement defect detection than the other algorithms compared.
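Anchor generation with K-means is typically done by clustering ground-truth (w, h) pairs under a 1 − IoU distance so that anchors match the dataset's box shapes. The sketch below illustrates that standard recipe; the quantile initialization, median update, and k = 2 are our assumptions for the demo, not details from the paper:

```python
import numpy as np

def iou_wh(wh, centers):
    """IoU between (N, 2) box sizes and (K, 2) anchor sizes, assuming
    boxes share a corner (the standard trick for anchor clustering)."""
    inter = (np.minimum(wh[:, None, 0], centers[None, :, 0])
             * np.minimum(wh[:, None, 1], centers[None, :, 1]))
    union = ((wh[:, 0] * wh[:, 1])[:, None]
             + (centers[:, 0] * centers[:, 1])[None, :] - inter)
    return inter / union

def kmeans_anchors(wh, k=3, iters=50):
    """Cluster (w, h) pairs under a 1 - IoU distance; centers start at
    evenly spaced area quantiles and are updated with the median."""
    order = np.argsort(wh[:, 0] * wh[:, 1])
    centers = wh[order[np.linspace(0, len(wh) - 1, k).astype(int)]].astype(float)
    for _ in range(iters):
        assign = (1.0 - iou_wh(wh, centers)).argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = np.median(wh[assign == j], axis=0)
    return centers[np.argsort(centers[:, 0] * centers[:, 1])]

rng = np.random.default_rng(1)
wh = np.concatenate([rng.normal(20, 2, (50, 2)),   # small defects
                     rng.normal(80, 5, (50, 2))])  # large defects
anchors = kmeans_anchors(wh, k=2)
print(anchors)  # one anchor near 20x20 and one near 80x80
```

Anchors that already match the data's box shapes give the detector better initial IoU with targets, which is why this step accelerates convergence.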
Affiliation(s)
- Shenghuai Wang
- School of Mechanical Engineering, Hubei University of Automotive Technology, Shiyan 442002, China
26
Avazov K, Jamil MK, Muminov B, Abdusalomov AB, Cho YI. Fire Detection and Notification Method in Ship Areas Using Deep Learning and Computer Vision Approaches. Sensors (Basel) 2023; 23:7078. [PMID: 37631614 PMCID: PMC10458310 DOI: 10.3390/s23167078]
Abstract
Fire incidents onboard ships have extensive and severe consequences for the safety of the crew, the cargo, the environment, finances, and reputation. Timely detection of fires is therefore essential for quick response and effective mitigation. This paper presents a fire detection technique based on YOLOv7 (You Only Look Once version 7), incorporating improved deep learning algorithms. The YOLOv7 architecture, with an improved E-ELAN (extended efficient layer aggregation network) as its backbone, serves as the basis of our fire detection system; its enhanced feature fusion technique makes it superior to its predecessors. To train the model, we collected 4622 images of various ship scenarios and applied data augmentation techniques such as rotation, horizontal and vertical flips, and scaling. Through rigorous evaluation, our model showcases enhanced fire-recognition capabilities that improve maritime safety, achieving an accuracy of 93% in detecting fires and helping to minimize catastrophic incidents. Objects visually similar to fire may lead to false predictions and detections, but this can be controlled by expanding the dataset. Our model can be utilized as a real-time fire detector in challenging environments and for small-object detection. Experimental results show that the proposed method can be used successfully for the protection of ships and for monitoring fires in ship port areas. Finally, we compared the performance of our method with recently reported fire-detection approaches, using widely adopted performance metrics to test the fire classification results achieved.
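The flip and rotation augmentations listed above are plain array operations; a minimal NumPy sketch (bounding-box coordinates would need the matching transforms, which are omitted here):

```python
import numpy as np

def augment(img):
    """Return simple geometric augmentations of an (H, W, C) image:
    horizontal flip, vertical flip, and a 90-degree counter-clockwise
    rotation. Any bounding boxes would need the matching coordinate
    transforms (not shown)."""
    return [
        img[:, ::-1],   # horizontal flip
        img[::-1, :],   # vertical flip
        np.rot90(img),  # rotate 90 degrees counter-clockwise
    ]

img = np.zeros((120, 160, 3), dtype=np.uint8)
views = augment(img)
for v in views:
    print(v.shape)  # (120, 160, 3), (120, 160, 3), then (160, 120, 3)
```

Each source image thus yields several geometrically distinct training samples, which is how a 4622-image collection is stretched into a larger effective dataset.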
Affiliation(s)
- Kuldoshbay Avazov
- Department of Computer Engineering, Gachon University, Seongnam-si 461-701, Republic of Korea
- Muhammad Kafeel Jamil
- Department of Computer Engineering, Gachon University, Seongnam-si 461-701, Republic of Korea
- Bahodir Muminov
- Department of Artificial Intelligence, Tashkent State University of Economics, Tashkent 100066, Uzbekistan
- Young-Im Cho
- Department of Computer Engineering, Gachon University, Seongnam-si 461-701, Republic of Korea
27
Chen IDS, Yang CM, Chen MJ, Chen MC, Weng RM, Yeh CH. Deep Learning-Based Recognition of Periodontitis and Dental Caries in Dental X-ray Images. Bioengineering (Basel) 2023; 10:911. [PMID: 37627796 PMCID: PMC10451544 DOI: 10.3390/bioengineering10080911]
Abstract
Dental X-ray images are important and useful for dentists to diagnose dental diseases. Utilizing deep learning in dental X-ray images can help dentists quickly and accurately identify common dental diseases such as periodontitis and dental caries. This paper applies image processing and deep learning technologies to dental X-ray images to propose a simultaneous recognition method for periodontitis and dental caries. The single-tooth X-ray image is detected by the YOLOv7 object detection technique and cropped from the periapical X-ray image. Then, it is processed through contrast-limited adaptive histogram equalization to enhance the local contrast, and bilateral filtering to eliminate noise while preserving the edge. The deep learning architecture for classification comprises a pre-trained EfficientNet-B0 and fully connected layers that output two labels by the sigmoid activation function for the classification task. The average precision of tooth detection using YOLOv7 is 97.1%. For the recognition of periodontitis, the area under the curve (AUC) of the receiver operating characteristic (ROC) curve is 98.67%, and the AUC of the precision-recall (PR) curve is 98.38%. For the recognition of dental caries, the AUC of the ROC curve is 98.31%, and the AUC of the PR curve is 97.55%. Different from the conventional deep learning-based methods for a single disease such as periodontitis or dental caries, the proposed approach can provide the recognition of both periodontitis and dental caries simultaneously. This recognition method presents good performance in the identification of periodontitis and dental caries, thus facilitating dental diagnosis.
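The contrast-enhancement step can be illustrated with plain global histogram equalization, a simplified stand-in for the tile-based, clip-limited CLAHE the paper actually uses:

```python
import numpy as np

def hist_equalize(img):
    """Global histogram equalization for a uint8 grayscale image: map
    each intensity through the normalized cumulative histogram so the
    output uses the full 0-255 range. (CLAHE additionally works per
    tile and clips the histogram; this is the simplified global form.)"""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum().astype(float)
    cdf_min = cdf[cdf > 0].min()
    lut = np.clip((cdf - cdf_min) * 255.0 / (cdf[-1] - cdf_min), 0, 255)
    return lut.astype(np.uint8)[img]

img = np.tile(np.arange(100, 156, dtype=np.uint8), (64, 1))  # low-contrast strip
out = hist_equalize(img)
print(img.min(), img.max(), "->", out.min(), out.max())  # 100 155 -> 0 255
```

The paper's CLAHE step has the same goal (stretching local contrast so lesion boundaries stand out) but bounds noise amplification via its clip limit, which this global sketch does not.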
Affiliation(s)
- Chieh-Ming Yang
- Department of Electrical Engineering, National Dong Hwa University, Hualien 97401, Taiwan
- Mei-Juan Chen
- Department of Electrical Engineering, National Dong Hwa University, Hualien 97401, Taiwan
- Ming-Chin Chen
- Department of Electrical Engineering, National Dong Hwa University, Hualien 97401, Taiwan
- Ro-Min Weng
- Department of Electrical Engineering, National Dong Hwa University, Hualien 97401, Taiwan
- Chia-Hung Yeh
- Department of Electrical Engineering, National Taiwan Normal University, Taipei 10610, Taiwan
- Department of Electrical Engineering, National Sun Yat-sen University, Kaohsiung 80424, Taiwan
28
Abstract
Introduction The issue of low detection rates and high false negative rates in maritime search and rescue operations has been a critical problem in current target detection algorithms. This is mainly due to the complex maritime environment and the small size of most targets. These challenges affect the algorithms' robustness and generalization. Methods We proposed YOLOv7-CSAW, an improved maritime search and rescue target detection algorithm based on YOLOv7. We used the K-means++ algorithm for the optimal size determination of prior anchor boxes, ensuring an accurate match with actual objects. The C2f module was incorporated for a lightweight model capable of obtaining richer gradient flow information. The model's perception of small target features was increased with the non-parameter simple attention module (SimAM). We further upgraded the feature fusion network to an adaptive feature fusion network (ASFF) to address the lack of high-level semantic features in small targets. Lastly, we implemented the wise intersection over union (WIoU) loss function to tackle large positioning errors and missed detections. Results Our algorithm was extensively tested on a maritime search and rescue dataset with YOLOv7 as the baseline model. We observed a significant improvement in the detection performance compared to traditional deep learning algorithms, with a mean average precision (mAP) improvement of 10.73% over the baseline model. Discussion YOLOv7-CSAW significantly enhances the accuracy and robustness of small target detection in complex scenes. This algorithm effectively addresses the common issues experienced in maritime search and rescue operations, specifically improving the detection rates and reducing false negatives, proving to be a superior alternative to current target detection algorithms.
29
Zhang J, Liu S, Yuan H, Yong R, Duan S, Li Y, Spencer J, Lim EG, Yu L, Song P. Deep Learning for Microfluidic-Assisted Caenorhabditis elegans Multi-Parameter Identification Using YOLOv7. Micromachines (Basel) 2023; 14:1339. [PMID: 37512650 PMCID: PMC10386376 DOI: 10.3390/mi14071339]
Abstract
Caenorhabditis elegans (C. elegans) is an ideal model organism for studying human diseases and genetics due to its transparency and suitability for optical imaging. However, manually sorting a large population of C. elegans for experiments is tedious and inefficient. The microfluidic-assisted C. elegans sorting chip is considered a promising platform to address this issue due to its automation and ease of operation. Nevertheless, automated C. elegans sorting with multiple parameters requires efficient identification technology, given the different research demands for worm phenotypes. To improve the efficiency and accuracy of multi-parameter sorting, we developed a deep learning model using You Only Look Once (YOLO)v7 to detect and recognize C. elegans automatically. We used a dataset of 3931 annotated worms in microfluidic chips from various studies. Our model showed higher precision in automated C. elegans identification than YOLOv5 and Faster R-CNN, achieving a mean average precision at a 0.5 intersection-over-union threshold (mAP@0.5) of 99.56%. Additionally, our model demonstrated good generalization ability, achieving an mAP@0.5 of 94.21% on an external validation set. Our model can efficiently and accurately identify and calculate multiple phenotypes of worms, including size, movement speed, and fluorescence. The multi-parameter identification model can improve sorting efficiency and potentially promote the development of automated and integrated microfluidic platforms.
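The mAP@0.5 figures above count a detection as a true positive only when its intersection over union (IoU) with a ground-truth box reaches 0.5; the underlying computation, sketched for (x1, y1, x2, y2) boxes:

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes in
    (x1, y1, x2, y2) form: overlap area divided by union area."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (0, 0, 10, 10)))  # 1.0: perfect overlap
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # ~0.333: below the 0.5 cutoff
```

mAP@0.5 then averages precision over recall levels (and over classes) using this 0.5 cutoff to decide matches.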
Affiliation(s)
- Jie Zhang
- School of Advanced Technology, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
- Department of Electrical and Electronic Engineering, University of Liverpool, Liverpool L69 3BX, UK
- Shuhe Liu
- School of Advanced Technology, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
- Hang Yuan
- School of Advanced Technology, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
- Ruiqi Yong
- School of Advanced Technology, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
- Sixuan Duan
- School of Advanced Technology, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
- Department of Electrical and Electronic Engineering, University of Liverpool, Liverpool L69 3BX, UK
- Yifan Li
- School of Advanced Technology, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
- Department of Electrical and Electronic Engineering, University of Liverpool, Liverpool L69 3BX, UK
- Joseph Spencer
- Department of Electrical and Electronic Engineering, University of Liverpool, Liverpool L69 3BX, UK
- Eng Gee Lim
- School of Advanced Technology, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
- Department of Electrical and Electronic Engineering, University of Liverpool, Liverpool L69 3BX, UK
- Limin Yu
- School of Advanced Technology, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
- Department of Electrical and Electronic Engineering, University of Liverpool, Liverpool L69 3BX, UK
- Pengfei Song
- School of Advanced Technology, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
- Department of Electrical and Electronic Engineering, University of Liverpool, Liverpool L69 3BX, UK
30
Kim SY, Muminov A. Forest Fire Smoke Detection Based on Deep Learning Approaches and Unmanned Aerial Vehicle Images. Sensors (Basel) 2023; 23:5702. [PMID: 37420867 DOI: 10.3390/s23125702]
Abstract
Wildfire poses a significant threat and is considered a severe natural disaster, endangering forest resources, wildlife, and human livelihoods. In recent times, there has been an increase in the number of wildfire incidents, with both human interaction with nature and the impacts of global warming playing major roles. Rapid identification of fire from early smoke can be crucial in combating this issue, as it allows firefighters to respond quickly and prevent the fire from spreading. We therefore propose a refined version of the YOLOv7 model for detecting smoke from forest fires. To begin, we compiled a collection of 6500 UAV pictures of forest fire smoke. To enhance YOLOv7's feature extraction capabilities, we incorporated the CBAM attention mechanism. We then added an SPPF+ layer to the network's backbone to better concentrate on smaller wildfire smoke regions. Finally, decoupled heads were introduced into the YOLOv7 model to extract useful information from an array of data. A BiFPN was used to accelerate multi-scale feature fusion and acquire more specific features, with learnable weights introduced in the BiFPN so that the network can prioritize the input feature maps that most affect the result. The testing findings on our forest fire smoke dataset revealed that the proposed approach successfully detected forest fire smoke with an AP50 of 86.4%, 3.9% higher than previous single- and multiple-stage object detectors.
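The learnable BiFPN weights mentioned above follow EfficientDet's "fast normalized fusion" scheme: each input feature map gets a non-negative scalar weight, normalized so the weights sum to roughly one. A sketch with fixed illustrative weights (in the network these are learned):

```python
import numpy as np

def fast_normalized_fusion(feats, w, eps=1e-4):
    """BiFPN-style fast normalized fusion: blend same-shaped feature
    maps as sum_i (w_i / (sum_j w_j + eps)) * feats[i], with the
    weights clamped non-negative (ReLU)."""
    w = np.maximum(np.asarray(w, dtype=float), 0.0)
    w = w / (w.sum() + eps)
    return sum(wi * f for wi, f in zip(w, feats))

a, b = np.zeros((4, 4)), np.ones((4, 4))
fused = fast_normalized_fusion([a, b], [1.0, 3.0])
print(fused[0, 0])  # ~0.75: the second map dominates with 3x the weight
```

Compared with softmax-based fusion, this division-only normalization is cheaper while keeping the weights bounded, which is why EfficientDet calls it "fast".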
Affiliation(s)
- Soon-Young Kim
- Department of Physical Education, Gachon University, Seongnam 13120, Republic of Korea
- Azamjon Muminov
- Department of Computer Engineering, Gachon University, Seongnam 13120, Republic of Korea
31
Li J, Tian Y, Chen J, Wang H. Rock Crack Recognition Technology Based on Deep Learning. Sensors (Basel) 2023; 23:5421. [PMID: 37420588 DOI: 10.3390/s23125421]
Abstract
The changes in cracks on the surface of a rock mass reflect the development of geological disasters, so such cracks are early signs of landslides, collapses, and debris flows. To research geological disasters, it is crucial to swiftly and precisely gather crack information on the surface of rock masses. Drone videography surveys can effectively avoid the limitations of the terrain and have become an essential method in disaster investigation. This manuscript proposes rock crack recognition technology based on deep learning. First, images of cracks on the surface of a rock mass obtained by a drone were cut into small pictures of 640 × 640. Next, a VOC dataset for crack object detection was produced by enhancing the data with augmentation techniques and labeling the images using LabelImg. We then divided the data into test and training sets at a 2:8 ratio. The YOLOv7 model was subsequently improved by combining it with different attention mechanisms; this study is the first to combine YOLOv7 and an attention mechanism for rock crack detection. Finally, the rock crack recognition technology was obtained through comparative analysis. The results show that the improved model using the SimAM attention mechanism reaches a precision of 100%, a recall of 75%, and an AP of 96.89%, processing 100 images in 10 s, making it the best of the six models compared. Relative to the original model, precision improved by 1.67%, recall by 1.25%, and AP by 1.45%, with no decrease in running speed. This proves that rock crack recognition technology based on deep learning can achieve rapid and precise results, and it provides a new research direction for identifying early signs of geological hazards.
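Cutting the drone imagery into 640 × 640 pieces is a simple slicing loop; the sketch below drops any ragged right/bottom remainder (the paper does not state its edge handling, so that choice, like overlap or padding, is an assumption):

```python
import numpy as np

def tile_image(img, size=640):
    """Split an (H, W, ...) image into non-overlapping size x size
    tiles, discarding any ragged right/bottom remainder for
    simplicity."""
    h, w = img.shape[:2]
    return [img[y:y + size, x:x + size]
            for y in range(0, h - size + 1, size)
            for x in range(0, w - size + 1, size)]

img = np.zeros((1280, 1920, 3), dtype=np.uint8)
tiles = tile_image(img)
print(len(tiles), tiles[0].shape)  # 6 (640, 640, 3)
```

Tiling keeps each training sample at the detector's native input resolution, so thin cracks are not shrunk into invisibility by downscaling the full drone frame.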
Affiliation(s)
- Jinbei Li
- School of Hydraulic Engineering, Dalian University of Technology, Dalian 116024, China
- Yu Tian
- Department of Water Resources Research, China Institute of Water Resources and Hydropower Research, Beijing 100038, China
- Juan Chen
- Department of Water Resources Research, China Institute of Water Resources and Hydropower Research, Beijing 100038, China
- Hao Wang
- School of Hydraulic Engineering, Dalian University of Technology, Dalian 116024, China
- Department of Water Resources Research, China Institute of Water Resources and Hydropower Research, Beijing 100038, China
32
Zhang C, Hu Z, Xu L, Zhao Y. A YOLOv7 incorporating the Adan optimizer based corn pests identification method. Front Plant Sci 2023; 14:1174556. [PMID: 37342143 PMCID: PMC10277678 DOI: 10.3389/fpls.2023.1174556]
Abstract
Major insect pests of corn include the corn borer, armyworm, bollworm, aphid, and corn leaf mite. Timely and accurate detection of these pests is crucial for effective pest control and scientific decision making. However, existing identification methods based on traditional machine learning and neural networks are limited by high model training costs and low recognition accuracy. To address these problems, we propose a YOLOv7 maize pest identification method incorporating the Adan optimizer. First, we selected three major corn pests, the corn borer, armyworm, and bollworm, as research objects. Then, we collected and constructed a corn pest dataset, using data augmentation to address the scarcity of corn pest data. Second, we chose the YOLOv7 network as the detection model and replaced its original optimizer with the Adan optimizer to reduce the high computational cost of training. The Adan optimizer can efficiently sense surrounding gradient information in advance, allowing the model to escape sharp local minima; thus, the robustness and accuracy of the model can be improved while significantly reducing the required computing power. Finally, we performed ablation experiments and compared the method with traditional approaches and other common object detection networks. Theoretical analysis and experimental results show that the model incorporating the Adan optimizer requires only 1/2-2/3 of the computing power of the original network to exceed the original network's performance. The mAP@[.5:.95] (mean Average Precision) of the improved network reaches 96.69% and the precision reaches 99.95%. Meanwhile, mAP@[.5:.95] improved by 2.79%-11.83% over the original YOLOv7 and by 41.98%-60.61% over other common object detection models. In complex natural scenes, our proposed method is not only time-efficient but also achieves higher recognition accuracy, reaching the SOTA level.
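The Adan optimizer referenced here maintains three exponential moving averages: of the gradient, of the gradient difference, and of a Nesterov-corrected second moment. The following is an illustrative single-step NumPy sketch of the published recurrences, with assumed coefficient values; it is not the training code used in the paper.

```python
import numpy as np

def adan_step(theta, g, g_prev, state, lr=0.1,
              b1=0.02, b2=0.08, b3=0.01, eps=1e-8, wd=0.0):
    """One Adan update on parameter array theta given gradient g."""
    m, v, n = state
    diff = g - g_prev                                    # gradient difference
    m = (1 - b1) * m + b1 * g                            # EMA of gradients
    v = (1 - b2) * v + b2 * diff                         # EMA of differences
    n = (1 - b3) * n + b3 * (g + (1 - b2) * diff) ** 2   # corrected 2nd moment
    update = (m + (1 - b2) * v) / (np.sqrt(n) + eps)
    theta = (theta - lr * update) / (1 + lr * wd)        # decoupled weight decay
    return theta, (m, v, n)

# one step on f(x) = x^2 (gradient 2x) from x = 1, fresh state
zeros = np.zeros(1)
theta, state = adan_step(np.array([1.0]), np.array([2.0]), zeros,
                         (zeros, zeros, zeros))
```

Because the difference term lets the update anticipate how the gradient is changing, Adan can take Nesterov-like steps without an extra forward pass, which is the property the abstract credits for escaping sharp local minima.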
Affiliation(s)
- Chong Zhang
- School of Information and Communication Engineering, State Key Laboratory of Marine Resource Utilization in South China Sea, Hainan University, Haikou, China
- Zhuhua Hu
- School of Information and Communication Engineering, State Key Laboratory of Marine Resource Utilization in South China Sea, Hainan University, Haikou, China
- Lewei Xu
- School of Information and Communication Engineering, State Key Laboratory of Marine Resource Utilization in South China Sea, Hainan University, Haikou, China
- Yaochi Zhao
- School of Cyberspace Security, State Key Laboratory of Marine Resource Utilization in South China Sea, Hainan University, Haikou, China

33
Ni Y, Mao J, Fu Y, Wang H, Zong H, Luo K. Damage Detection and Localization of Bridge Deck Pavement Based on Deep Learning. Sensors (Basel) 2023; 23:s23115138. [PMID: 37299865 DOI: 10.3390/s23115138]
Abstract
Bridge deck pavement damage has a significant effect on driving safety and the long-term durability of bridges. To detect and localize bridge deck pavement damage, a three-stage detection method based on the you-only-look-once version 7 (YOLOv7) network and a revised LaneNet was proposed in this study. In stage 1, the Road Damage Dataset 2022 (RDD2022) was preprocessed and adopted to train the YOLOv7 model, yielding five classes of damage. In stage 2, the LaneNet network was pruned to retain the semantic segmentation part, with the VGG16 network as an encoder to generate lane-line binary images. In stage 3, the lane-line binary images were post-processed by a proposed image processing algorithm to obtain the lane area. Based on the damage coordinates from stage 1, the final pavement damage classes and lane localization were obtained. The proposed method was evaluated on the RDD2022 dataset and applied to the Fourth Nanjing Yangtze River Bridge in China. The results show that the mean average precision (mAP) of YOLOv7 on the preprocessed RDD2022 dataset reaches 0.663, higher than that of other models in the YOLO series. The lane localization accuracy of the revised LaneNet is 0.933, higher than the 0.856 of instance segmentation. Meanwhile, the inference speed of the revised LaneNet is 12.3 frames per second (FPS) on an NVIDIA GeForce RTX 3090, higher than the 6.53 FPS of instance segmentation. The proposed method can provide a reference for the maintenance of bridge deck pavement.
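Stage 3's damage-to-lane assignment reduces to a geometric test: does a damage box's centre fall inside a lane polygon? A minimal sketch under hypothetical box and lane formats; the paper's own post-processing is more involved:

```python
def point_in_polygon(pt, poly):
    """Ray-casting test: is point (x, y) inside the polygon (vertex list)?"""
    x, y = pt
    inside = False
    j = len(poly) - 1
    for i in range(len(poly)):
        xi, yi = poly[i]
        xj, yj = poly[j]
        # count edge crossings of a horizontal ray from the point
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside

def localize_damage(damage_boxes, lane_polygons):
    """Assign each damage box (x1, y1, x2, y2, cls) to the lane whose polygon
    contains the box centre; None if it lies outside every lane."""
    results = []
    for x1, y1, x2, y2, cls in damage_boxes:
        centre = ((x1 + x2) / 2, (y1 + y2) / 2)
        lane = next((i for i, poly in enumerate(lane_polygons)
                     if point_in_polygon(centre, poly)), None)
        results.append((cls, lane))
    return results

lanes = [[(0, 0), (100, 0), (100, 200), (0, 200)],      # lane 0
         [(100, 0), (200, 0), (200, 200), (100, 200)]]  # lane 1
dets = [(10, 20, 40, 60, "crack"), (150, 50, 180, 90, "pothole")]
assigned = localize_damage(dets, lanes)
```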
Affiliation(s)
- Youhao Ni
- Key Laboratory of Concrete and Prestressed Concrete Structures of Ministry of Education, Southeast University, Nanjing 210096, China
- Jianxiao Mao
- Key Laboratory of Concrete and Prestressed Concrete Structures of Ministry of Education, Southeast University, Nanjing 210096, China
- Yuguang Fu
- School of Civil and Environmental Engineering, Nanyang Technological University, Singapore 639798, Singapore
- Hao Wang
- Key Laboratory of Concrete and Prestressed Concrete Structures of Ministry of Education, Southeast University, Nanjing 210096, China
- Hai Zong
- School of Transportation, Southeast University, Nanjing 210096, China
- Nanjing Highway Development (Group) Co., Ltd., Nanjing 210096, China
- Kun Luo
- Key Laboratory of Concrete and Prestressed Concrete Structures of Ministry of Education, Southeast University, Nanjing 210096, China

34
Chen X, Pu H, He Y, Lai M, Zhang D, Chen J, Pu H. An Efficient Method for Monitoring Birds Based on Object Detection and Multi-Object Tracking Networks. Animals (Basel) 2023; 13:ani13101713. [PMID: 37238144 DOI: 10.3390/ani13101713]
Abstract
To protect birds, it is crucial to identify their species and determine their populations across different regions. Currently, however, bird monitoring relies mainly on manual techniques, such as point counts conducted by researchers and ornithologists in the field. This approach can be inefficient and error-prone, which is not always conducive to bird conservation efforts. In this paper, we propose an efficient method for wetland bird monitoring based on object detection and multi-object tracking networks. First, we constructed a manually annotated dataset for bird species detection comprising 3737 bird images, annotating the entire body and the head of each bird separately. We also built a new dataset containing 11,139 complete individual bird images for the multi-object tracking task. Second, we performed comparative experiments with a batch of state-of-the-art object detection networks; the results demonstrated that the YOLOv7 network, trained with annotations of the entire bird body, was the most effective. To enhance YOLOv7's performance, we added three GAM modules on the head side of YOLOv7 to minimize information diffusion and amplify global interaction representations, and adopted the Alpha-IoU loss to achieve more accurate bounding-box regression. The experimental results revealed that the improved method offers greater accuracy, with mAP@0.5 improving to 0.951 and mAP@0.5:0.95 improving to 0.815. We then sent the detection information to DeepSORT for bird tracking and classification counting. Finally, we used an area counting method to count birds by species and obtain information on flock distribution. The method described in this paper effectively addresses the monitoring challenges in bird conservation.
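The Alpha-IoU loss mentioned above is a one-line generalization of the IoU loss: raising IoU to a power α > 1 concentrates the loss gradient on nearly-correct boxes. A sketch under assumed (x1, y1, x2, y2) box coordinates:

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def alpha_iou_loss(a, b, alpha=3.0):
    """Alpha-IoU regression loss: 1 - IoU^alpha. The power term up-weights
    high-IoU boxes, sharpening bounding-box regression near convergence."""
    return 1.0 - iou(a, b) ** alpha

pred, gt = (10, 10, 50, 50), (20, 20, 60, 60)
loss = alpha_iou_loss(pred, gt)
```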
Affiliation(s)
- Xian Chen
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
- Hongli Pu
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
- Yihui He
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
- Mengzhen Lai
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
- Daike Zhang
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
- Junyang Chen
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
- Haibo Pu
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
- Ya'an Digital Agricultural Engineering Technology Research Center, Ya'an 625000, China

35
Mortada MJ, Tomassini S, Anbar H, Morettini M, Burattini L, Sbrollini A. Segmentation of Anatomical Structures of the Left Heart from Echocardiographic Images Using Deep Learning. Diagnostics (Basel) 2023; 13:diagnostics13101683. [PMID: 37238168 DOI: 10.3390/diagnostics13101683]
Abstract
Knowledge of the anatomical structures of the left heart, specifically the left atrium (LA) and the left ventricle (i.e., endocardium, LVendo, and epicardium, LVepi), is essential for the evaluation of cardiac functionality. Manual segmentation of cardiac structures from echocardiography is the baseline reference, but the results are user-dependent and time-consuming to obtain. With the aim of supporting clinical practice, this paper presents a new deep-learning (DL)-based tool for segmenting the anatomical structures of the left heart from echocardiographic images. Specifically, it was designed as a combination of two convolutional neural networks, the YOLOv7 algorithm and a U-Net, and it automatically segments an echocardiographic image into LVendo, LVepi, and LA. The DL-based tool was trained and tested on the Cardiac Acquisitions for Multi-Structure Ultrasound Segmentation (CAMUS) dataset of the University Hospital of St. Etienne, which consists of echocardiographic images from 450 patients. For each patient, apical two- and four-chamber views at end-systole and end-diastole were acquired and annotated by clinicians. Globally, our DL-based tool segmented LVendo, LVepi, and LA with Dice similarity coefficients of 92.63%, 85.59%, and 87.57%, respectively. In conclusion, the presented DL-based tool proved reliable in automatically segmenting the anatomical structures of the left heart, supporting cardiological clinical practice.
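The Dice similarity coefficients reported above compare predicted masks against clinician annotations. A NumPy sketch of the metric on binary masks:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice similarity coefficient between two binary masks:
    2|A ∩ B| / (|A| + |B|); eps guards against empty masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

# two overlapping 16-pixel square masks sharing a 4-pixel corner
a = np.zeros((10, 10)); a[2:6, 2:6] = 1
b = np.zeros((10, 10)); b[4:8, 4:8] = 1
```

Dice rewards overlap relative to the combined mask sizes, which is why it is preferred over plain pixel accuracy for small structures such as the LA.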
Affiliation(s)
- Mhd Jafar Mortada
- Department of Information Engineering, Università Politecnica delle Marche, 60121 Ancona, Italy
- Selene Tomassini
- Department of Information Engineering, Università Politecnica delle Marche, 60121 Ancona, Italy
- Haidar Anbar
- Department of Information Engineering, Università Politecnica delle Marche, 60121 Ancona, Italy
- Micaela Morettini
- Department of Information Engineering, Università Politecnica delle Marche, 60121 Ancona, Italy
- Laura Burattini
- Department of Information Engineering, Università Politecnica delle Marche, 60121 Ancona, Italy
- Agnese Sbrollini
- Department of Information Engineering, Università Politecnica delle Marche, 60121 Ancona, Italy

36
Nadeem H, Javed K, Nadeem Z, Khan MJ, Rubab S, Yon DK, Naqvi RA. Road Feature Detection for Advance Driver Assistance System Using Deep Learning. Sensors (Basel) 2023; 23:s23094466. [PMID: 37177670 PMCID: PMC10181670 DOI: 10.3390/s23094466]
Abstract
Hundreds of people are injured or killed in road accidents. These accidents are caused by several intrinsic and extrinsic factors, including the driver's attentiveness to the road and its associated features. These features include approaching vehicles, pedestrians, and static fixtures such as road lanes and traffic signs. If a driver is made aware of these features in a timely manner, a large share of these accidents can be avoided. This study proposes a computer-vision-based solution for detecting and recognizing traffic types and signs to assist drivers and pave the way for self-driving cars. A real-world roadside dataset was collected under varying lighting and road conditions, and individual frames were annotated. Two deep learning models, YOLOv7 and Faster R-CNN, were trained on this custom dataset to detect the aforementioned road features. The models produced state-of-the-art mean Average Precision (mAP) scores of 87.20% and 75.64%, respectively, along with class accuracies of over 98.80%. The proposed model provides an excellent benchmark to build on to help improve traffic situations and enable future technological advances such as Advanced Driver Assistance Systems (ADAS) and self-driving cars.
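The mAP and class-accuracy figures above rest on matching detections to ground truths at an IoU threshold. A simplified single-image, single-class sketch of that matching (greedy, score-sorted; full mAP additionally sweeps confidence thresholds and averages AP over classes):

```python
def iou(a, b):
    """Intersection over union of two boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def evaluate_detections(dets, gts, iou_thr=0.5):
    """Greedily match score-sorted detections (box, score) to ground-truth
    boxes at an IoU threshold; return (precision, recall)."""
    dets = sorted(dets, key=lambda d: -d[1])
    matched = set()
    tp = 0
    for box, _score in dets:
        best, best_iou = None, iou_thr
        for i, g in enumerate(gts):
            if i in matched:
                continue               # each ground truth matches at most once
            v = iou(box, g)
            if v >= best_iou:
                best, best_iou = i, v
        if best is not None:
            matched.add(best)
            tp += 1
    precision = tp / len(dets) if dets else 0.0
    recall = tp / len(gts) if gts else 0.0
    return precision, recall

gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [((0, 0, 10, 10), 0.9), ((50, 50, 60, 60), 0.8)]  # one hit, one miss
p, r = evaluate_detections(dets, gts)
```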
Affiliation(s)
- Hamza Nadeem
- Engineering and Management Sciences, Balochistan University of Information Technology Engineering & Management Sciences, Quetta 87300, Pakistan
- School of Mechanical and Manufacturing Engineering, National University of Science and Technology, Islamabad 44000, Pakistan
- Kashif Javed
- School of Mechanical and Manufacturing Engineering, National University of Science and Technology, Islamabad 44000, Pakistan
- Zain Nadeem
- Engineering and Management Sciences, Balochistan University of Information Technology Engineering & Management Sciences, Quetta 87300, Pakistan
- School of Mechanical and Manufacturing Engineering, National University of Science and Technology, Islamabad 44000, Pakistan
- Muhammad Jawad Khan
- School of Mechanical and Manufacturing Engineering, National University of Science and Technology, Islamabad 44000, Pakistan
- Saddaf Rubab
- Department of Computer Engineering, College of Computing and Informatics, University of Sharjah, Sharjah 27272, United Arab Emirates
- Dong Keon Yon
- Center for Digital Health, Medical Science Research Institute, Kyung Hee University Medical Center, Kyung Hee University College of Medicine, Seoul 02447, Republic of Korea
- Rizwan Ali Naqvi
- Department of Unmanned Vehicle Engineering, Sejong University, Seoul 05006, Republic of Korea

37
Zhou J, Zhang Y, Wang J. A Dragon Fruit Picking Detection Method Based on YOLOv7 and PSP-Ellipse. Sensors (Basel) 2023; 23:3803. [PMID: 37112144 PMCID: PMC10141975 DOI: 10.3390/s23083803]
Abstract
Dragon fruit is one of the most popular fruits in China and Southeast Asia. However, it is mainly picked manually, imposing high labor intensity on farmers. The hard branches and complex postures of dragon fruit make automated picking difficult. For picking dragon fruits with diverse postures, this paper proposes a new dragon fruit detection method that not only identifies and locates the fruit but also detects the endpoints at its head and root, providing more visual information for a dragon fruit picking robot. First, YOLOv7 is used to locate and classify the dragon fruit. Then, we propose a PSP-Ellipse method to further detect the endpoints of the dragon fruit, including dragon fruit segmentation via PSPNet, endpoint positioning via an ellipse fitting algorithm, and endpoint classification via ResNet. Experiments were conducted to test the proposed method. In dragon fruit detection, the precision, recall, and average precision of YOLOv7 are 0.844, 0.924, and 0.932, respectively, and YOLOv7 outperforms several other models. In dragon fruit segmentation, PSPNet performs better than other commonly used semantic segmentation models, with segmentation precision, recall, and mean intersection over union of 0.959, 0.943, and 0.906, respectively. In endpoint detection, the distance error and angle error of ellipse-fitting-based positioning are 39.8 pixels and 4.3°, and the endpoint classification accuracy of ResNet is 0.92. The proposed PSP-Ellipse method is a large improvement over two keypoint regression methods based on ResNet and UNet. Orchard picking experiments verified that the proposed method is effective. The detection method proposed in this paper not only advances the automatic picking of dragon fruit but also provides a reference for the detection of other fruits.
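Once PSP-Ellipse has fitted an ellipse to the segmented fruit, the head/root endpoint candidates are simply the major-axis endpoints. A sketch assuming the fitted centre, semi-axes, and rotation angle are already known:

```python
import math

def major_axis_endpoints(cx, cy, a, b, theta):
    """Endpoints of the major axis of an ellipse with centre (cx, cy),
    semi-axes a >= b, rotated by theta radians. For an elongated fruit
    mask these approximate the head and root points."""
    dx, dy = a * math.cos(theta), a * math.sin(theta)
    return (cx - dx, cy - dy), (cx + dx, cy + dy)

# a vertical ellipse: centre (100, 50), semi-major 40, rotated 90 degrees
p1, p2 = major_axis_endpoints(100, 50, 40, 15, math.radians(90))
```

A classifier (ResNet in the paper) then decides which of the two endpoints is the head and which is the root.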
Affiliation(s)
- Jialiang Zhou
- School of Mechanical and Electronic Engineering, Nanjing Forestry University, Nanjing 210037, China
- Co-Innovation Center of Efficient Processing and Utilization of Forest Resources, Nanjing Forestry University, Nanjing 210037, China
- Yueyue Zhang
- School of Mechanical and Electronic Engineering, Nanjing Forestry University, Nanjing 210037, China
- Co-Innovation Center of Efficient Processing and Utilization of Forest Resources, Nanjing Forestry University, Nanjing 210037, China
- Jinpeng Wang
- School of Mechanical and Electronic Engineering, Nanjing Forestry University, Nanjing 210037, China
- Co-Innovation Center of Efficient Processing and Utilization of Forest Resources, Nanjing Forestry University, Nanjing 210037, China

38
Azurmendi I, Zulueta E, Lopez-Guede JM, Azkarate J, González M. Cooktop Sensing Based on a YOLO Object Detection Algorithm. Sensors (Basel) 2023; 23:2780. [PMID: 36904983 PMCID: PMC10007026 DOI: 10.3390/s23052780]
Abstract
Deep Learning (DL) has provided significant breakthroughs in many areas of research and industry. The development of Convolutional Neural Networks (CNNs) has improved computer-vision-based techniques, making the information gathered from cameras more useful. For this reason, studies have recently been carried out on the use of image-based DL in areas of daily life. In this paper, an object-detection-based algorithm is proposed to improve the user experience with cooking appliances. The algorithm can sense common kitchen objects and identify situations of interest to users, such as utensils on lit hobs, boiling, smoke and oil in kitchenware, and good cookware size adjustment, among others. In addition, the authors achieved sensor fusion by using a cooker hob with Bluetooth connectivity, making it possible to interact with it automatically via an external device such as a computer or a mobile phone. Our main contribution focuses on supporting people while they are cooking, controlling heaters, and alerting them with different types of alarms. To the best of our knowledge, this is the first time a YOLO algorithm has been used to control a cooktop by means of visual sensing. Moreover, this paper compares the detection performance of different YOLO networks. Additionally, a dataset of more than 7500 images was generated and multiple data augmentation techniques were compared. The results show that YOLOv5s can detect common kitchen objects with high accuracy and speed and can be employed in realistic cooking environments. Finally, multiple examples of identified situations and the corresponding cooktop actions are presented.
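Situations such as "utensil on a lit hob" can be derived from detector output with plain box geometry. An illustrative rule under assumed (x1, y1, x2, y2) boxes; the paper's situation logic is richer than this:

```python
def overlap_ratio(inner, outer):
    """Fraction of the `inner` box area covered by the `outer` box."""
    ix1, iy1 = max(inner[0], outer[0]), max(inner[1], outer[1])
    ix2, iy2 = min(inner[2], outer[2]), min(inner[3], outer[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = (inner[2] - inner[0]) * (inner[3] - inner[1])
    return inter / area if area else 0.0

def utensils_on_lit_hobs(utensil_boxes, lit_hob_boxes, min_overlap=0.5):
    """Flag detected utensils that mostly sit on a lit hob box — one of the
    'interesting situations' a cooktop monitor might report."""
    return [u for u in utensil_boxes
            if any(overlap_ratio(u, h) >= min_overlap for h in lit_hob_boxes)]

hobs = [(0, 0, 100, 100)]                       # lit hob (e.g. from BT state)
utensils = [(10, 10, 90, 90), (150, 150, 200, 200)]
alerts = utensils_on_lit_hobs(utensils, hobs)
```

Combining such geometric rules with the hob's Bluetooth state (which burners are actually on) is the sensor-fusion idea the abstract describes.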
Affiliation(s)
- Iker Azurmendi
- Department of Systems and Automatic Control, Faculty of Engineering of Vitoria-Gasteiz, University of the Basque Country (UPV/EHU), Nieves Cano, 01006 Vitoria-Gasteiz, Spain
- CS Centro Stirling S. Coop., Avda. Álava 3, 20550 Aretxabaleta, Spain
- Ekaitz Zulueta
- Department of Systems and Automatic Control, Faculty of Engineering of Vitoria-Gasteiz, University of the Basque Country (UPV/EHU), Nieves Cano, 01006 Vitoria-Gasteiz, Spain
- Jose Manuel Lopez-Guede
- Department of Systems and Automatic Control, Faculty of Engineering of Vitoria-Gasteiz, University of the Basque Country (UPV/EHU), Nieves Cano, 01006 Vitoria-Gasteiz, Spain
- Jon Azkarate
- CS Centro Stirling S. Coop., Avda. Álava 3, 20550 Aretxabaleta, Spain
- Manuel González
- CS Centro Stirling S. Coop., Avda. Álava 3, 20550 Aretxabaleta, Spain

39
Wang Y, Fu B, Fu L, Xia C. In Situ Sea Cucumber Detection across Multiple Underwater Scenes Based on Convolutional Neural Networks and Image Enhancements. Sensors (Basel) 2023; 23:2037. [PMID: 36850633 PMCID: PMC9962839 DOI: 10.3390/s23042037]
Abstract
Recently, rapidly developing artificial intelligence and computer vision techniques have provided technical solutions to promote production efficiency and reduce labor costs in aquaculture and marine resource surveys, with traditional manual surveys being replaced by advanced intelligent technologies. However, underwater object detection and recognition suffer from image distortion and degradation. In this work, automatic monitoring of sea cucumber in natural conditions is implemented with a state-of-the-art object detector, YOLOv7. To mitigate image distortion and degradation, image enhancement methods are adopted to improve the accuracy and stability of sea cucumber detection across multiple underwater scenes. Five well-known image enhancement methods are employed to improve the detection performance of YOLOv7 and YOLOv5, and their effectiveness is evaluated experimentally. Non-local image dehazing (NLD) was the most effective for sea cucumber detection across multiple underwater scenes for both YOLOv7 and YOLOv5. The best average precision (AP) of sea cucumber detection was 0.940, achieved by YOLOv7 with NLD. With NLD enhancement, the APs of YOLOv7 and YOLOv5 increased by 1.1% and 1.6%, respectively, and the best AP was 2.8% higher than that of YOLOv5 without image enhancement. Moreover, the real-time ability of YOLOv7 was examined; its average prediction time was 4.3 ms. Experimental results demonstrate that the proposed method can be applied to marine organism surveys by underwater mobile platforms or to automatic analysis of underwater videos.
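Non-local dehazing itself is too involved to sketch here, but the role an enhancement step plays is easy to show: remap a murky, low-contrast frame to full dynamic range before it reaches the detector. A percentile contrast stretch as a simple stand-in (not one of the five methods compared in the paper):

```python
import numpy as np

def contrast_stretch(img, low_pct=2, high_pct=98):
    """Percentile-based contrast stretch: map the [low, high] percentile
    range of a degraded frame onto [0, 255] and clip the tails."""
    lo, hi = np.percentile(img, [low_pct, high_pct])
    out = (img.astype(np.float32) - lo) / max(hi - lo, 1e-6)
    return np.clip(out * 255.0, 0, 255).astype(np.uint8)

# simulated murky frame: grey values squeezed into [80, 120]
rng = np.random.default_rng(0)
frame = rng.integers(80, 121, size=(64, 64)).astype(np.uint8)
enhanced = contrast_stretch(frame)
```

In a detection pipeline the enhanced frame, not the raw one, is fed to YOLOv7, which is the arrangement whose AP gains the abstract reports.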
Affiliation(s)
- Yi Wang
- Coastal Defense College, Naval Aeronautical University, Yantai 264003, China
- Boya Fu
- Yantai Institute of Coastal Zone Research, Chinese Academy of Sciences, Yantai 264003, China
- Longwen Fu
- Yantai Institute of Coastal Zone Research, Chinese Academy of Sciences, Yantai 264003, China
- Chunlei Xia
- Yantai Institute of Coastal Zone Research, Chinese Academy of Sciences, Yantai 264003, China

40
Zhang Y, Sun Y, Wang Z, Jiang Y. YOLOv7-RAR for Urban Vehicle Detection. Sensors (Basel) 2023; 23:1801. [PMID: 36850399 PMCID: PMC9964850 DOI: 10.3390/s23041801]
Abstract
Aiming at the high missed-detection rate of the YOLOv7 algorithm for vehicle detection on urban roads, its weak perception of small targets in perspective, and its insufficient feature extraction, the YOLOv7-RAR recognition algorithm is proposed. The algorithm improves YOLOv7 in three directions. First, in view of the insufficient nonlinear feature fusion of the original backbone network, the Res3Unit structure is used to reconstruct the backbone of YOLOv7, improving the architecture's ability to obtain nonlinear features. Second, because urban roads contain many interfering backgrounds and the original network is weak at localizing targets such as vehicles, a plug-and-play hybrid attention module, ACmix, is added after the SPPCSPC layer of the backbone to enhance the network's attention to vehicles and reduce interference from other targets. Third, because the receptive field of the original network narrows as the model deepens, leading to a high miss rate for small targets, the Gaussian receptive field scheme of the RFLA (Gaussian-receptive-field-based label assignment) module is used at the connection between the feature fusion area and the detection head to enlarge the model's receptive field for small objects in the image. Combining the three measures, and taking the first letter of each, the improved algorithm is named YOLOv7-RAR. Experiments show that on urban roads with crowded vehicles and different weather patterns, the average detection accuracy of the YOLOv7-RAR algorithm reaches 95.1%, 2.4% higher than that of the original algorithm, and its AP50:90 performance is 12.6% higher. The running speed of the YOLOv7-RAR algorithm reaches 96 FPS, meeting the real-time requirements of vehicle detection; hence, the algorithm can be well applied to vehicle detection.
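The RFLA scheme cited above scores label assignment by treating boxes and receptive fields as 2-D Gaussians and comparing them with a distribution distance. A sketch using diagonal covariances and a KL divergence as that distance (illustrative; RFLA's exact formulation differs in detail):

```python
import math

def box_to_gaussian(box):
    """Model an axis-aligned box (x1, y1, x2, y2) as a 2-D Gaussian:
    mean at the centre, std of half the width/height per axis."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2), ((x2 - x1) / 2, (y2 - y1) / 2)

def gaussian_kld(g1, g2):
    """KL divergence KL(N1 || N2) for 2-D Gaussians with diagonal covariance."""
    (m1x, m1y), (s1x, s1y) = g1
    (m2x, m2y), (s2x, s2y) = g2
    return 0.5 * ((s1x / s2x) ** 2 + (s1y / s2y) ** 2
                  + ((m2x - m1x) / s2x) ** 2 + ((m2y - m1y) / s2y) ** 2
                  - 2 + 2 * math.log((s2x * s2y) / (s1x * s1y)))

gt = box_to_gaussian((10, 10, 20, 20))          # small ground-truth object
near = box_to_gaussian((11, 11, 21, 21))        # slightly shifted candidate
far = box_to_gaussian((40, 40, 50, 50))         # distant candidate
```

Unlike IoU, this distance stays informative even when a tiny ground-truth box and a candidate receptive field barely overlap, which is why it suits small-object label assignment.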
41
Yang Z, Zhao C, Maeda H, Sekimoto Y. Development of a Large-Scale Roadside Facility Detection Model Based on the Mapillary Dataset. Sensors (Basel) 2022; 22:9992. [PMID: 36560361 PMCID: PMC9781587 DOI: 10.3390/s22249992]
Abstract
The detection of road facilities and roadside structures is essential for high-definition (HD) maps and intelligent transportation systems (ITSs). With the rapid development of deep-learning algorithms in recent years, deep-learning-based object detection has become more accurate and efficient and is now an essential tool for HD map reconstruction and advanced driver-assistance systems (ADASs). Performance evaluation and comparison of the latest deep-learning algorithms in this field is therefore indispensable. However, most existing works limit their focus to detecting individual targets, such as vehicles, pedestrians, or traffic signs, in driving-view images. In this study, we present a systematic comparison of three recent algorithms for large-scale multi-class road facility detection, namely Mask R-CNN, YOLOX, and YOLOv7, on the Mapillary dataset. The experimental results are evaluated in terms of recall, precision, mean F1-score, and computational cost. YOLOv7 outperforms the other two networks in road facility detection, with a precision and recall of 87.57% and 72.60%, respectively. Furthermore, we tested model performance on a custom dataset obtained from the Japanese road environment; the results demonstrate that models trained on the Mapillary dataset exhibit sufficient generalization ability. The comparison presented in this study aids in understanding the strengths and limitations of the latest networks in multi-class object detection on large-scale street-level datasets.
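The mean F1-score used for evaluation combines precision and recall as a harmonic mean; plugging in YOLOv7's figures reported in this entry:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0 if both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# YOLOv7's road-facility precision (87.57%) and recall (72.60%) from above
f1 = f1_score(0.8757, 0.7260)
```

The harmonic mean punishes imbalance: a detector cannot buy a high F1 with precision alone if recall lags, which is why F1 is a fairer single number than either metric on its own.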
Affiliation(s)
- Zhehui Yang
- Center for Spatial Information Science, The University of Tokyo, Tokyo 277-8568, Japan
- Chenbo Zhao
- Center for Spatial Information Science, The University of Tokyo, Tokyo 277-8568, Japan
- Hiroya Maeda
- Urban X Technologies, Shibuya-ku, Tokyo 150-0002, Japan
- Yoshihide Sekimoto
- Center for Spatial Information Science, The University of Tokyo, Tokyo 277-8568, Japan

42
Nguyen HV, Bae JH, Lee YE, Lee HS, Kwon KR. Comparison of Pre-Trained YOLO Models on Steel Surface Defects Detector Based on Transfer Learning with GPU-Based Embedded Devices. Sensors (Basel) 2022; 22:s22249926. [PMID: 36560304 PMCID: PMC9783860 DOI: 10.3390/s22249926]
Abstract
Steel is one of the most basic industrial materials and plays an important role in the machinery industry. However, surface defects heavily affect steel quality, and the demand for surface defect detectors draws much attention from researchers all over the world. There are still some drawbacks: datasets are of limited accessibility or small-scale, and related works focus on developing models without deeply considering real-time applications. In this paper, we investigate the feasibility of applying state-of-the-art deep learning methods based on YOLO models as real-time steel surface defect detectors. In particular, we compare the performance of YOLOv5, YOLOX, and YOLOv7 trained on the small-scale open-source NEU-DET dataset with an RTX 2080 GPU. From the experimental results, YOLOX-s achieves the best accuracy of 89.6% mAP on the NEU-DET dataset. We then deploy the weights of the trained YOLO models on Nvidia embedded devices, the Jetson Nano and Jetson Xavier AGX, to evaluate their real-time performance. We also apply real-time optimization techniques (i.e., exporting to TensorRT, lowering the precision to FP16 or INT8, and reducing the input image size to 320 × 320) to reduce inference time, which also reduces mAP accuracy.
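The INT8 path tested here trades accuracy for speed because weights are rounded onto a 256-level grid. TensorRT's calibration is more sophisticated, but the basic symmetric scheme below shows where the mAP loss comes from:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric INT8 quantization: scale float weights into [-127, 127],
    round, and return (int8 weights, scale) for dequantization."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(1000).astype(np.float32)   # dummy layer weights
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
max_err = float(np.abs(w - w_hat).max())           # bounded by scale / 2
```

Each weight moves by at most half a quantization step; accumulated over a whole network, those perturbations are what shave points off mAP while the int8 arithmetic raises throughput.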
Affiliation(s)
- Hoan-Viet Nguyen
- Intown Co., Ltd., No. 401, 21, Centum 6-ro, Haeundae-gu, Busan 08592, Republic of Korea
- Department of Artificial Intelligence Convergence, Pukyong National University, Busan 48513, Republic of Korea
- Jun-Hee Bae
- Intown Co., Ltd., No. 401, 21, Centum 6-ro, Haeundae-gu, Busan 08592, Republic of Korea
- Yong-Eun Lee
- Intown Co., Ltd., No. 401, 21, Centum 6-ro, Haeundae-gu, Busan 08592, Republic of Korea
- Han-Sung Lee
- Intown Co., Ltd., No. 401, 21, Centum 6-ro, Haeundae-gu, Busan 08592, Republic of Korea
- Ki-Ryong Kwon
- Department of Artificial Intelligence Convergence, Pukyong National University, Busan 48513, Republic of Korea
- Correspondence: Tel.: +82-51-629-6257
43
Chen J, Liu H, Zhang Y, Zhang D, Ouyang H, Chen X. A Multiscale Lightweight and Efficient Model Based on YOLOv7: Applied to Citrus Orchard. Plants (Basel) 2022; 11:3260. [PMID: 36501301] [PMCID: PMC9738521] [DOI: 10.3390/plants11233260] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/24/2022] [Revised: 11/20/2022] [Accepted: 11/23/2022] [Indexed: 06/01/2023]
Abstract
With the gradual increase in annual citrus production, the efficiency of human labor has become the bottleneck limiting production. For unmanned citrus picking, the detection accuracy, prediction speed, and lightweight deployability of the model are the key issues, and traditional object detection methods often fail to balance all three. We therefore propose an improved YOLOv7 network, Citrus-YOLOv7, which introduces a small-object detection layer, lightweight convolutions, and a CBAM (Convolutional Block Attention Module) attention mechanism to achieve multi-scale feature extraction and fusion while reducing the number of model parameters. On the citrus fruit test set, the mean average precision (mAP@0.5) reached 97.29%, the average prediction time was 69.38 ms, and the parameter count and computation cost were reduced by 11.21 M and 28.71 G, respectively, compared with the original YOLOv7. Citrus-YOLOv7 also performs better than current state-of-the-art network models, so the proposed model can contribute to solving the citrus detection problem.
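The CBAM module mentioned above combines channel and spatial attention. Its channel-attention half can be sketched in plain NumPy as below; this is an illustrative sketch of the published CBAM formulation, not the authors' implementation, and `w1`/`w2` stand in for the shared-MLP weights that a trained network would supply:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cbam_channel_attention(x, w1, w2):
    """Channel-attention half of CBAM for one C x H x W feature map.
    w1: (C//r, C) and w2: (C, C//r) are shared-MLP weights (reduction ratio r).
    Returns the channel-reweighted feature map, same shape as x."""
    avg = x.mean(axis=(1, 2))                       # (C,) spatial average pooling
    mx = x.max(axis=(1, 2))                         # (C,) spatial max pooling
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)    # shared 2-layer MLP with ReLU
    scale = sigmoid(mlp(avg) + mlp(mx))             # (C,) weights in (0, 1)
    return x * scale[:, None, None]
```

The two pooled descriptors pass through the same MLP and are summed before the sigmoid, so informative channels get weights near 1 and uninformative ones are suppressed.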
Affiliation(s)
- Junyang Chen
- College of Information Engineering, Sichuan Agricultural University, Ya’an 625000, China
- Hui Liu
- College of Information Engineering, Sichuan Agricultural University, Ya’an 625000, China
- Yating Zhang
- College of Information Engineering, Sichuan Agricultural University, Ya’an 625000, China
- Daike Zhang
- College of Information Engineering, Sichuan Agricultural University, Ya’an 625000, China
- Hongkun Ouyang
- College of Mechanical and Electrical Engineering, Sichuan Agricultural University, Ya’an 625000, China
- Xiaoyan Chen
- College of Information Engineering, Sichuan Agricultural University, Ya’an 625000, China
- Sichuan Key Laboratory of Agricultural Information Engineering, Ya’an 625000, China
44
Zheng J, Wu H, Zhang H, Wang Z, Xu W. Insulator-Defect Detection Algorithm Based on Improved YOLOv7. Sensors (Basel) 2022; 22:8801. [PMID: 36433397] [PMCID: PMC9697038] [DOI: 10.3390/s22228801] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 10/17/2022] [Revised: 11/04/2022] [Accepted: 11/08/2022] [Indexed: 06/16/2023]
Abstract
Existing detection methods face a huge challenge in identifying insulators with minor defects in transmission-line images with complex backgrounds. To ensure the safe operation of transmission lines, an improved YOLOv7 model is proposed. Firstly, the target boxes of the insulator dataset are clustered with K-means++ to generate anchor boxes better suited to insulator-defect targets. Secondly, the Coordinate Attention (CoordAtt) module and the HorBlock module are added to the network so that, in the channel and spatial domains, it enhances the effective features of the feature-extraction process and weakens the ineffective ones. Finally, the SCYLLA-IoU (SIoU) and focal loss functions are used to accelerate convergence and address the imbalance between positive and negative samples. Furthermore, to optimize overall performance, the non-maximum suppression (NMS) method is improved to reduce the accidental deletion and false detection of defect targets. The experimental results show that the mean average precision of our model is 93.8%, higher than the Faster R-CNN, YOLOv7, and YOLOv5s models by 7.6%, 3.7%, and 4%, respectively. The proposed model can effectively realize accurate detection of small objects in complex backgrounds.
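Anchor clustering of the kind described above groups the dataset's (width, height) boxes under a 1 − IoU distance so the anchors match typical defect shapes. A hedged sketch, not the paper's code: it uses the deterministic farthest-point variant of K-means++ seeding (true K-means++ samples new seeds with probability proportional to distance) followed by standard Lloyd iterations:

```python
import numpy as np

def iou_wh(boxes, anchors):
    """IoU between (N,2) box sizes and (K,2) anchor sizes, both anchored at the origin."""
    inter = np.minimum(boxes[:, None, 0], anchors[None, :, 0]) * \
            np.minimum(boxes[:, None, 1], anchors[None, :, 1])
    union = boxes[:, 0:1] * boxes[:, 1:2] + (anchors[:, 0] * anchors[:, 1])[None, :] - inter
    return inter / union

def kmeanspp_anchors(wh, k, iters=30):
    """Cluster (width, height) pairs into k anchors under the 1 - IoU distance.
    Seeding is farthest-point (a deterministic K-means++-style variant)."""
    anchors = wh[:1].copy()
    while len(anchors) < k:
        # Each point's distance to its nearest existing seed; pick the farthest.
        d = (1.0 - iou_wh(wh, anchors)).min(axis=1)
        anchors = np.vstack([anchors, wh[d.argmax()]])
    for _ in range(iters):  # Lloyd iterations: assign by best IoU, update by median
        assign = iou_wh(wh, anchors).argmax(axis=1)
        for j in range(k):
            if np.any(assign == j):
                anchors[j] = np.median(wh[assign == j], axis=0)
    return anchors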
Affiliation(s)
- Jianfeng Zheng
- School of Mechanical Engineering and Rail Transit, Changzhou University, Changzhou 213164, China
- Jiangsu Province Engineering Research Center of High-Level Energy and Power Equipment, Changzhou University, Changzhou 213164, China
- Hang Wu
- School of Mechanical Engineering and Rail Transit, Changzhou University, Changzhou 213164, China
- Han Zhang
- Key Laboratory of Noise and Vibration, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China
- Zhaoqi Wang
- School of Mechanical Engineering and Rail Transit, Changzhou University, Changzhou 213164, China
- Weiyue Xu
- School of Mechanical Engineering and Rail Transit, Changzhou University, Changzhou 213164, China
- Jiangsu Province Engineering Research Center of High-Level Energy and Power Equipment, Changzhou University, Changzhou 213164, China
45
Yang Z, Ni C, Li L, Luo W, Qin Y. Three-Stage Pavement Crack Localization and Segmentation Algorithm Based on Digital Image Processing and Deep Learning Techniques. Sensors (Basel) 2022; 22:8459. [PMID: 36366156] [PMCID: PMC9656577] [DOI: 10.3390/s22218459] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/15/2022] [Revised: 10/30/2022] [Accepted: 10/31/2022] [Indexed: 06/16/2023]
Abstract
Images of expressway asphalt pavement cracks obtained with a three-dimensional line-scan laser are easily affected by external factors such as uneven illumination, environmental noise, occluding shadows, and foreign objects on the pavement. To locate and extract cracks accurately and efficiently, this article proposes a three-stage asphalt pavement crack localization and segmentation method combining traditional digital image processing with deep learning. In the first stage, guided filtering and Retinex methods are used to preprocess the crack image; the processed image removes redundant noise and improves brightness, with an information entropy 63% higher than that of the unpreprocessed image. In the second stage, the newly proposed YOLO-SAMT target detection model is used to locate the pavement cracks; it is 5.42 percentage points higher than the original YOLOv7 model on mAP@0.5, which enhances recognition and localization and reduces the computation required for crack-contour extraction in the next stage. In the third stage, an improved k-means clustering algorithm extracts the cracks: compared with traditional k-means clustering, it improves accuracy by 7.34 percentage points, raises the true-positive rate by 6.57 percentage points, and lowers the false-positive rate by 18.32 percentage points, better extracting the crack contour. In summary, the proposed method improves the quality of pavement defect images, enhances crack identification and localization, reduces computation, improves the accuracy of crack-contour extraction, and provides a new solution for highway crack inspection.
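The traditional k-means baseline that the third stage improves upon can be sketched as 1-D clustering on pixel intensity, labelling the darkest cluster as crack (cracks are darker than the surrounding asphalt). A plain-NumPy illustration of that baseline, not the paper's improved variant:

```python
import numpy as np

def kmeans_crack_mask(gray, k=2, iters=20):
    """Segment a grayscale pavement image by 1-D k-means on pixel intensity and
    return a boolean mask of the darkest cluster (the presumed cracks)."""
    pixels = gray.astype(float).ravel()
    # Initialise centroids evenly across the observed intensity range.
    centers = np.linspace(pixels.min(), pixels.max(), k)
    for _ in range(iters):
        # Assign each pixel to its nearest centroid, then recompute centroids.
        assign = np.abs(pixels[:, None] - centers[None, :]).argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = pixels[assign == j].mean()
    crack_label = centers.argmin()          # darkest cluster = crack
    return (assign == crack_label).reshape(gray.shape)
```

On real imagery this baseline is brittle under shadows and uneven lighting, which is exactly why the paper preprocesses with guided filtering and Retinex first.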
Affiliation(s)
- Zhen Yang
- College of Transportation and Civil Engineering, Fujian Agriculture and Forestry University, Fuzhou 350108, China
- Changshuang Ni
- College of Transportation and Civil Engineering, Fujian Agriculture and Forestry University, Fuzhou 350108, China
- Lin Li
- College of Transportation and Civil Engineering, Fujian Agriculture and Forestry University, Fuzhou 350108, China
- College of Transportation Engineering, Nanjing Tech University, Nanjing 211816, China
- Wenting Luo
- College of Transportation Engineering, Nanjing Tech University, Nanjing 211816, China
- Yong Qin
- School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100044, China