1
Yin Z, Wang Z, Fan C, Wang X, Qiu T. Edge Detection via Fusion Difference Convolution. SENSORS (BASEL, SWITZERLAND) 2023; 23:6883. [PMID: 37571663 PMCID: PMC10422205 DOI: 10.3390/s23156883] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Revised: 07/30/2023] [Accepted: 07/31/2023] [Indexed: 08/13/2023]
Abstract
Edge detection is a crucial step in many computer vision tasks, and in recent years, models based on deep convolutional neural networks (CNNs) have achieved human-level performance in edge detection. However, we have observed that CNN-based methods rely on pre-trained backbone networks and generate edge images with unwanted background details. We propose four new fusion difference convolution (FDC) structures that integrate traditional gradient operators into modern CNNs. At the same time, we have also added a channel spatial attention module (CSAM) and an up-sampling module (US). These structures allow the model to better recognize the semantic and edge information in the images. Our model is trained from scratch on the BIPED dataset without any pre-trained weights and achieves promising results. Moreover, it generalizes well to other datasets without fine-tuning.
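The abstract does not spell out the four FDC structures. As a rough illustration of the general idea of folding a gradient operator into a convolution, the sketch below shows a plain central-difference convolution, in which the kernel weights are applied to the differences between each neighbour and the centre pixel rather than to raw intensities. This is an assumed, generic formulation, not the authors' FDC; the function name and the toy inputs are hypothetical.

```python
def central_difference_conv(image, kernel):
    """Valid-mode sliding window over pixel differences (neighbour - centre).

    Assumption: `kernel` is square with an odd side length. This is a generic
    central-difference convolution, not the FDC structures from the paper.
    """
    h, w = len(image), len(image[0])
    k = len(kernel)
    r = k // 2
    out = []
    for i in range(r, h - r):
        row = []
        for j in range(r, w - r):
            center = image[i][j]
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    neighbour = image[i + di - r][j + dj - r]
                    acc += kernel[di][dj] * (neighbour - center)
            row.append(acc)
        out.append(row)
    return out


# Hypothetical toy inputs: a constant image and a vertical step edge.
flat = [[5.0] * 4 for _ in range(4)]
step = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]

ones = [[1.0] * 3 for _ in range(3)]          # weights that do not sum to zero
prewitt_like = [[1.0, 0.0, -1.0] for _ in range(3)]

flat_response = central_difference_conv(flat, ones)
step_response = central_difference_conv(step, prewitt_like)
```

Because only differences enter the sum, a constant region produces a zero response whatever the weights are, while the step edge produces a non-zero response. That is the property that makes difference-based kernels behave like learned gradient operators.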
Affiliation(s)
- Zhenyu Yin
  - Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang 110168, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Zisong Wang
  - Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang 110168, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Chao Fan
  - Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang 110168, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Xiaohui Wang
  - Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang 110168, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Tong Qiu
  - Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang 110168, China
  - School of Computer Science and Technology, Shenyang University of Chemical Technology, Shenyang 110142, China
2
Zhou L, Lin C, Pang X, Yang H, Pan Y, Zhang Y. Learning parallel and hierarchical mechanisms for edge detection. Front Neurosci 2023; 17:1194713. [PMID: 37559703 PMCID: PMC10407095 DOI: 10.3389/fnins.2023.1194713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Accepted: 07/03/2023] [Indexed: 08/11/2023]
Abstract
Edge detection is one of the fundamental components of advanced computer vision tasks, and it is essential to conserve computational resources while ensuring a certain level of performance. In this paper, we propose a lightweight edge detection network called the Parallel and Hierarchical Network (PHNet), which draws inspiration from the parallel and hierarchical processing of visual information by neurons in the visual cortex and is implemented as a convolutional neural network (CNN). Specifically, we designed an encoding network with parallel and hierarchical processing based on the "retina-LGN-V1" visual information transmission pathway and carefully modeled the receptive fields of the cells involved in the pathway. Empirical evaluation demonstrates that, with a parameter count of only 0.2 M, the proposed model achieves an ODS score of 0.781 on the BSDS500 dataset and 0.863 on the MBDD dataset. These results underscore the efficacy of the proposed network in attaining strong edge detection performance at a low computational cost. Moreover, we believe that this study, which combines computational vision and biological vision, can provide new insights into edge detection model research.
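For readers unfamiliar with the ODS (optimal dataset scale) score quoted above: it is the best F-measure obtained when a single binarization threshold is shared by every image in the dataset. The sketch below is a simplified illustration under stated assumptions. It takes per-image match counts as given, whereas real benchmark code also performs tolerance-based matching of edge pixels. The function names and all counts are hypothetical.

```python
def f_measure(true_pos, n_predicted, n_ground_truth):
    """Harmonic mean of precision and recall, guarding against empty sets."""
    precision = true_pos / n_predicted if n_predicted else 0.0
    recall = true_pos / n_ground_truth if n_ground_truth else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def ods_score(counts_by_threshold):
    """Return (best_threshold, best_f) using one threshold for the whole dataset.

    counts_by_threshold maps a threshold to a list of per-image tuples
    (true_pos, n_predicted, n_ground_truth). Counts are pooled over images
    before the F-measure is computed.
    """
    best_threshold, best_f = None, 0.0
    for threshold, per_image in counts_by_threshold.items():
        tp = sum(c[0] for c in per_image)
        n_pred = sum(c[1] for c in per_image)
        n_gt = sum(c[2] for c in per_image)
        f = f_measure(tp, n_pred, n_gt)
        if f > best_f:
            best_threshold, best_f = threshold, f
    return best_threshold, best_f


# Hypothetical counts for two images at three thresholds.
toy_counts = {
    0.3: [(80, 120, 100), (70, 110, 100)],
    0.5: [(75, 90, 100), (65, 80, 100)],
    0.7: [(50, 55, 100), (40, 45, 100)],
}
best_threshold, best_f = ods_score(toy_counts)
```

With these made-up counts the middle threshold wins, since it balances precision and recall better than the permissive or the strict setting.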
Affiliation(s)
- Ling Zhou
  - Key Laboratory of AI and Information Processing (Hechi University), Education Department of Guangxi Zhuang Autonomous Region, Hechi University, Yizhou, China
- Chuan Lin
  - Key Laboratory of AI and Information Processing (Hechi University), Education Department of Guangxi Zhuang Autonomous Region, Hechi University, Yizhou, China
  - School of Automation, Guangxi University of Science and Technology, Liuzhou, China
  - Guangxi Key Laboratory of Automobile Components and Vehicle Technology, Guangxi University of Science and Technology, Liuzhou, China
- Xintao Pang
  - Key Laboratory of AI and Information Processing (Hechi University), Education Department of Guangxi Zhuang Autonomous Region, Hechi University, Yizhou, China
  - School of Automation, Guangxi University of Science and Technology, Liuzhou, China
  - Guangxi Key Laboratory of Automobile Components and Vehicle Technology, Guangxi University of Science and Technology, Liuzhou, China
- Hao Yang
  - School of Automation, Guangxi University of Science and Technology, Liuzhou, China
  - Guangxi Key Laboratory of Automobile Components and Vehicle Technology, Guangxi University of Science and Technology, Liuzhou, China
- Yongcai Pan
  - School of Automation, Guangxi University of Science and Technology, Liuzhou, China
  - Guangxi Key Laboratory of Automobile Components and Vehicle Technology, Guangxi University of Science and Technology, Liuzhou, China
- Yuwei Zhang
  - School of Automation, Guangxi University of Science and Technology, Liuzhou, China
  - Guangxi Key Laboratory of Automobile Components and Vehicle Technology, Guangxi University of Science and Technology, Liuzhou, China
3
Yao Y, Zhang Z, Peng B, Tang J. Bio-Inspired Network for Diagnosing Liver Steatosis in Ultrasound Images. Bioengineering (Basel) 2023; 10:768. [PMID: 37508795 PMCID: PMC10376777 DOI: 10.3390/bioengineering10070768] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2023] [Revised: 06/15/2023] [Accepted: 06/23/2023] [Indexed: 07/30/2023]
Abstract
Using ultrasound imaging to diagnose liver steatosis is of great significance for preventing diseases such as cirrhosis and liver cancer. Accurate diagnosis under conditions of low image quality, noise, and poor resolution remains a challenging task. Physiological studies have shown that the visual cortex of the biological visual system has selective attention mechanisms and feedback regulation from high-level features to low-level features. When processing visual information, these cortical regions selectively focus on more salient information and ignore unimportant details, which allows important features to be extracted effectively. Inspired by this, we propose a new diagnostic network for hepatic steatosis. To simulate the selection mechanism and feedback regulation of the visual cortex in the ventral pathway, the network consists of a receptive field feature extraction module, a parallel attention module, and a feedback connection. The receptive field feature extraction module corresponds to the inhibition exerted by the non-classical receptive field of V1 neurons on the classical receptive field. It processes the input image to suppress unimportant background texture. Two types of attention are adopted in the parallel attention module to process the same visual information and extract different important features for fusion, which improves the overall performance of the model. In addition, we construct a new dataset of fatty liver ultrasound images and validate the proposed model on this dataset. The experimental results show that the network performs well in terms of sensitivity, specificity, and accuracy for the diagnosis of fatty liver disease.
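The three metrics named at the end of the abstract follow directly from a binary confusion matrix. The snippet below is a minimal sketch of the standard definitions. The function name and the example counts are hypothetical and are not taken from the paper's dataset.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: diseased cases classified correctly/incorrectly.
    tn/fp: healthy cases classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)                  # true positive rate
    specificity = tn / (tn + fp)                  # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)    # overall fraction correct
    return sensitivity, specificity, accuracy


# Hypothetical counts: 50 steatosis cases and 50 healthy controls.
sensitivity, specificity, accuracy = diagnostic_metrics(tp=40, fp=5, tn=45, fn=10)
```

Reporting all three matters in a screening setting because accuracy alone can hide a poor sensitivity when the classes are imbalanced.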
Affiliation(s)
- Yuan Yao
  - General Practice Medical Center, West China Hospital, Sichuan University, Chengdu 610044, China
- Zhenguang Zhang
  - School of Automation, Guangxi University of Science and Technology, Liuzhou 545006, China
- Bo Peng
  - School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China
- Jin Tang
  - Tiaodenghe Community Health Service Center, Chengdu 610066, China
4
Lin C, Qiao Y, Pan Y. Bio-inspired interactive feedback neural networks for edge detection. APPL INTELL 2022. [DOI: 10.1007/s10489-022-04316-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/03/2022]
5
Zhang Z, Lin C, Qiao Y, Pan Y. Edge detection networks inspired by neural mechanisms of selective attention in biological visual cortex. Front Neurosci 2022; 16:1073484. [PMID: 36483183 PMCID: PMC9724618 DOI: 10.3389/fnins.2022.1073484] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/18/2022] [Accepted: 11/07/2022] [Indexed: 07/30/2023]
Abstract
Edge detection is of great importance to mid- and high-level vision tasks in computer vision, so improving its performance is valuable. Unlike previous edge detection methods, which focus only on the design of decoding networks, we propose a new edge detection network composed of a modulation coding network and a decoding network. The modulation coding network combines a modulation enhancement network with a coding network and is designed using the self-attention mechanism of the Transformer, inspired by the selective attention mechanism of V1, V2, and V4 in biological vision. The modulation enhancement network effectively enhances the feature extraction ability of the encoding network, realizes selective extraction of the global features of the input image, and improves the performance of the entire model. In addition, we designed a new decoding network based on the function of integrating feature information in the IT layer of the biological vision system. Unlike previous decoding networks, it combines top-down decoding and bottom-up decoding, uses down-sampling decoding to extract more features, and then achieves better performance by fusing up-sampling decoding features. We evaluated the proposed method experimentally on multiple publicly available datasets: BSDS500, NYUD-V2, and Barcelona Images for Perceptual Edge Detection (BIPED). The best performance is achieved on the NYUD and BIPED datasets, and the second-best result is achieved on BSDS500. Experimental results show that this method is highly competitive among all methods.
6
Chen Y, Lin C, Qiao Y. DPED: Bio-inspired dual-pathway network for edge detection. Front Bioeng Biotechnol 2022; 10:1008140. [PMID: 36312545 PMCID: PMC9606659 DOI: 10.3389/fbioe.2022.1008140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2022] [Accepted: 09/20/2022] [Indexed: 11/24/2022]
Abstract
Edge detection is significant as the basis of high-level visual tasks. Most encoder-decoder edge detection methods use convolutional neural networks, such as VGG16 or ResNet, as the encoding network. Studies on designing decoding networks have achieved good results. Swin Transformer (Swin) has recently attracted much attention in various visual tasks as a possible alternative to convolutional neural networks. Physiological studies have shown that two visual pathways converge in the visual cortex of the biological vision system, and that complex information transmission and communication between them is widespread. Inspired by the research on Swin and the biological vision pathway, we have designed a two-pathway encoding network. The first pathway network is the fine-tuned Swin; the second pathway network mainly comprises depthwise separable convolution. To simulate attention transmission and feature fusion between the first and second pathway networks, we have designed a second-pathway attention module and a pathways fusion module. Our proposed method outperforms the CNN-based SOTA method BDCN on the BSDS500 dataset. Moreover, our proposed method and the Transformer-based SOTA method EDTER each have their own performance advantages. In terms of FLOPs and FPS, our method has an advantage over EDTER.
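Assuming the convolution used in the second pathway is the standard depthwise separable convolution (a depthwise k x k filter per input channel followed by a 1 x 1 pointwise convolution), the short sketch below shows why such a pathway is cheap in parameters. The layer sizes are hypothetical and bias terms are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a dense k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k conv plus a 1 x 1 pointwise conv (bias ignored)."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
dense = standard_conv_params(3, 64, 128)
separable = depthwise_separable_params(3, 64, 128)
reduction = dense / separable
```

For this made-up layer the separable form needs roughly one eighth of the weights. This is consistent with the abstract's emphasis on FLOPs and FPS, although the paper's actual layer configuration is not given here.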
9
A Suspicious Multi-Object Detection and Recognition Method for Millimeter Wave SAR Security Inspection Images Based on Multi-Path Extraction Network. REMOTE SENSING 2021. [DOI: 10.3390/rs13244978] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
Abstract
There are several major challenges in detecting and recognizing multiple hidden objects in millimeter wave SAR security inspection images: inconsistent clarity of objects, similar objects, and complex background interference. To address these problems, a suspicious multi-object detection and recognition method based on the Multi-Path Extraction Network (MPEN) is proposed. In MPEN, You Only Look Once (YOLO) v3 is used as the base network, and a Multi-Path Feature Pyramid (MPFP) module and a modified residual block distribution are proposed. MPFP is designed to output the deep network feature layers separately. Then, to distinguish similar objects more easily, the residual block distribution is modified to improve the ability of the shallow network to capture details. To verify the effectiveness of the proposed method, millimeter wave SAR images from the laboratory's self-developed security inspection system are used for research on multi-object detection and recognition. The detection rate (probability of detecting a target) and the average false alarm rate (probability of an erroneous detection) of our method are 94.6% and 14.6%, respectively. The mean Average Precision (mAP) for multi-object recognition is 82.39%. Compared with YOLOv3, our method performs better in detecting and recognizing similar targets.