1. Sun Y, Wu F, Guo H, Li R, Yao J, Shen J. TeaDiseaseNet: multi-scale self-attentive tea disease detection. Front Plant Sci 2023; 14:1257212. PMID: 37900761; PMCID: PMC10600390; DOI: 10.3389/fpls.2023.1257212.
Abstract
Accurate detection of tea diseases is essential for optimizing tea yield and quality, improving production, and minimizing economic losses. In this paper, we introduce TeaDiseaseNet, a novel disease detection method designed to address the challenges in tea disease detection, such as variability in disease scales and dense, obscuring disease patterns. TeaDiseaseNet utilizes a multi-scale self-attention mechanism to enhance disease detection performance. Specifically, it incorporates a CNN-based module for extracting features at multiple scales, effectively capturing localized information such as texture and edges. This approach enables a comprehensive representation of tea images. Additionally, a self-attention module captures global dependencies among pixels, facilitating effective interaction between global information and local features. Furthermore, we integrate a channel attention mechanism, which selectively weighs and combines the multi-scale features, eliminating redundant information and enabling precise localization and recognition of tea disease information across diverse scales and complex backgrounds. Extensive comparative experiments and ablation studies validate the effectiveness of the proposed method, demonstrating superior detection results in scenarios characterized by complex backgrounds and varying disease scales. The presented method provides valuable insights for intelligent tea disease diagnosis, with significant potential for improving tea disease management and production.
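The channel attention step described in this abstract — selectively weighting multi-scale feature channels so redundant information is suppressed — can be illustrated with a minimal, dependency-free sketch. This is a generic squeeze-and-excitation style gate on toy inputs, not the authors' implementation; the function name and data are illustrative assumptions.

```python
import math

def channel_attention(feature_maps):
    """Squeeze-and-excitation style channel attention (illustrative sketch).

    feature_maps: list of 2-D channels (lists of rows of floats), e.g. one per scale.
    Returns the channels reweighted by a sigmoid gate over their global means.
    """
    # Squeeze: global average pool each channel to a single descriptor.
    descriptors = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                   for ch in feature_maps]
    # Excitation: a sigmoid gate (a learned MLP in the real mechanism).
    gates = [1.0 / (1.0 + math.exp(-d)) for d in descriptors]
    # Reweight: scale every value in a channel by its gate.
    return [[[v * g for v in row] for row in ch]
            for ch, g in zip(feature_maps, gates)]

# Two toy "scales": a strong channel and a weak one.
strong = [[2.0, 2.0], [2.0, 2.0]]
weak = [[-2.0, -2.0], [-2.0, -2.0]]
out = channel_attention([strong, weak])
```

The strong channel passes through nearly unchanged while the weak one is heavily attenuated, which is the intuition behind using the gate to discard redundant scales.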
Affiliation(s)
- Yange Sun
  - School of Computer and Information Technology, Xinyang Normal University, Xinyang, China
  - Research Center of Precision Sensing and Control, Institute of Automation, Chinese Academy of Sciences, Beijing, China
- Fei Wu
  - School of Computer and Information Technology, Xinyang Normal University, Xinyang, China
- Huaping Guo
  - School of Computer and Information Technology, Xinyang Normal University, Xinyang, China
  - Research Center of Precision Sensing and Control, Institute of Automation, Chinese Academy of Sciences, Beijing, China
- Ran Li
  - School of Computer and Information Technology, Xinyang Normal University, Xinyang, China
- Jianfeng Yao
  - School of Computer and Information Technology, Xinyang Normal University, Xinyang, China
  - Henan Key Laboratory of Tea Plant Biology, Xinyang Normal University, Xinyang, China
- Jianbo Shen
  - Intelligent Equipment Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
2. Liu H, Li Z, Lin S, Cheng L. A Residual UNet Denoising Network Based on Multi-Scale Feature Extraction and Attention-Guided Filter. Sensors (Basel) 2023; 23:7044. PMID: 37631582; PMCID: PMC10459023; DOI: 10.3390/s23167044.
Abstract
In order to obtain high-quality images, it is essential to remove noise effectively while reasonably retaining image details. In this paper, we propose a residual UNet denoising network that adds attention-guided filter and multi-scale feature extraction blocks. We design a multi-scale feature extraction block as the input block to expand the receptive field and extract more useful features. We also develop the attention-guided filter block to preserve edge information. Further, we use a global residual strategy to model the residual noise instead of directly modeling clean images. Experimental results show that our proposed network performs favorably against several state-of-the-art models: it not only suppresses noise more effectively but also improves the sharpness of the image.
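The global residual strategy mentioned above — modeling the residual noise rather than the clean image, then subtracting — reduces to a one-line idea. A minimal sketch follows, with a stand-in for the network that assumes the noise is a known constant offset (an illustrative assumption, not the paper's UNet):

```python
def denoise_residual(noisy, predict_noise):
    """Global residual strategy: the model predicts the noise, and the
    clean estimate is the noisy input minus that predicted residual."""
    residual = predict_noise(noisy)
    return [n - r for n, r in zip(noisy, residual)]

# Toy stand-in for the network: pretend it recovers a known constant offset.
noisy_signal = [1.5, 2.5, 3.5]          # clean [1, 2, 3] plus offset 0.5
estimate = denoise_residual(noisy_signal, lambda xs: [0.5 for _ in xs])
```

Learning the residual is often easier than learning the clean image directly because the noise component has a simpler distribution than natural image content.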
Affiliation(s)
- Hualin Liu
  - School of Mathematics and Statistics, Changchun University of Science and Technology, Changchun 130022, China
  - Laboratory of Remote Sensing Technology and Big Data Analysis, Zhongshan Research Institute, Changchun University of Science and Technology, Zhongshan 528437, China
- Zhe Li
  - School of Mathematics and Statistics, Changchun University of Science and Technology, Changchun 130022, China
  - Laboratory of Remote Sensing Technology and Big Data Analysis, Zhongshan Research Institute, Changchun University of Science and Technology, Zhongshan 528437, China
- Shijie Lin
  - School of Mathematics and Statistics, Changchun University of Science and Technology, Changchun 130022, China
- Libo Cheng
  - School of Mathematics and Statistics, Changchun University of Science and Technology, Changchun 130022, China
  - Laboratory of Remote Sensing Technology and Big Data Analysis, Zhongshan Research Institute, Changchun University of Science and Technology, Zhongshan 528437, China
3. Lu T, Gao M, Wang L. Crop classification in high-resolution remote sensing images based on multi-scale feature fusion semantic segmentation model. Front Plant Sci 2023; 14:1196634. PMID: 37593043; PMCID: PMC10428625; DOI: 10.3389/fpls.2023.1196634.
Abstract
The great success of deep learning in computer vision provides an opportunity for intelligent information extraction from remote sensing images. In agriculture, many deep convolutional neural networks have been applied to crop spatial distribution recognition. In this paper, crop mapping is cast as a semantic segmentation problem, and a multi-scale feature fusion semantic segmentation model, MSSNet, is proposed for crop recognition, exploiting the key property that multi-scale networks can learn features under different receptive fields to improve classification accuracy and fine-grained image classification. First, the network uses multi-branch asymmetric convolution and dilated convolution: each branch concatenates conventional convolutions with kernels of different sizes and dilated convolutions with different expansion coefficients. The features extracted from each branch are then concatenated to achieve multi-scale feature fusion. Finally, a skip connection combines low-level features from the shallow network with abstract features from the deep network to further enrich the semantic information. In crop classification experiments on Sentinel-2 remote sensing images, the method made full use of the spectral and spatial characteristics of crops and achieved good recognition results; the output crop classification maps showed better plot segmentation and edge characterization of ground objects. This study provides a good reference for high-precision crop mapping and field plot extraction while avoiding excessive data acquisition and processing.
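The dilated (expansion-coefficient) convolutions used by the branches can be sketched in one dimension: a kernel of size k with dilation d covers (k − 1) · d + 1 input samples, so parallel branches see different receptive fields before their outputs are concatenated. A toy illustration, not the paper's model:

```python
def dilated_conv1d(signal, kernel, dilation):
    """Valid 1-D convolution with a dilation (expansion) coefficient.
    A kernel of size k covers (k - 1) * dilation + 1 input samples."""
    span = (len(kernel) - 1) * dilation
    return [sum(kernel[j] * signal[i + j * dilation] for j in range(len(kernel)))
            for i in range(len(signal) - span)]

signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [1.0, -1.0]                            # simple difference kernel
branch_d1 = dilated_conv1d(signal, kernel, 1)   # receptive field of 2 samples
branch_d2 = dilated_conv1d(signal, kernel, 2)   # receptive field of 3 samples
fused = branch_d1 + branch_d2                   # concatenate branch outputs
```

With the linear ramp input, the dilation-1 branch responds with differences of −1 and the dilation-2 branch with −2, showing how the same kernel measures structure at two scales.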
Affiliation(s)
- Tingyu Lu
  - College of Geographical Sciences, Harbin Normal University, Harbin, China
- Meixiang Gao
  - Department of Geography and Spatial Information Techniques, Ningbo University, Ningbo, China
  - School of Civil and Environmental Engineering and Geography Science, Ningbo University, Ningbo, China
- Lei Wang
  - Department of Surveying Engineering, Heilongjiang Institute of Technology, Harbin, China
4. Pan H, Yang H, Xie L, Wang Z. Multi-scale fusion visual attention network for facial micro-expression recognition. Front Neurosci 2023; 17:1216181. PMID: 37575295; PMCID: PMC10412924; DOI: 10.3389/fnins.2023.1216181.
Abstract
Introduction: Micro-expressions are brief facial muscle movements that reveal genuine emotions a person attempts to conceal. In response to the challenge of their low intensity, recent studies have attempted to locate localized areas of facial muscle movement; however, this ignores the feature redundancy caused by inaccurate localization of the regions of interest. Methods: This paper proposes a novel multi-scale fusion visual attention network (MFVAN), which learns multi-scale local attention weights to mask regions of redundant features. Specifically, the model extracts multi-scale features of the apex frame in micro-expression video clips with convolutional neural networks. The attention mechanism focuses on the weights of local region features in the multi-scale feature maps. Redundant regions in the multi-scale features are then masked, and local features with high attention weights are fused for micro-expression recognition. Self-supervision and transfer learning reduce the influence of individual identity attributes and increase the robustness of the multi-scale feature maps. Finally, a multi-scale classification loss, a mask loss, and an identity-attribute-removal loss jointly optimize the model. Results: The proposed MFVAN is evaluated on the SMIC, CASME II, SAMM, and 3DB-Combined datasets, where it achieves state-of-the-art performance. The experimental results show that focusing on local regions at multiple scales contributes to micro-expression recognition. Discussion: The proposed MFVAN is the first model to combine image generation with visual attention mechanisms to address the combined challenge of individual identity attribute interference and low-intensity facial muscle movements. The MFVAN also reveals the impact of individual attributes on the localization of local regions of interest.
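The masking idea — normalize the local attention scores and suppress regions with low weights — can be sketched generically. The keep-top-k rule below is an assumption made for illustration; the paper's learned masks are more involved:

```python
import math

def mask_redundant(regions, scores, keep=2):
    """Mask low-attention regions: softmax the scores, keep the `keep`
    highest-weight local features, and zero out the rest (illustrative)."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Indices of regions ranked by attention weight, highest first.
    order = sorted(range(len(weights)), key=lambda i: -weights[i])
    kept = set(order[:keep])
    return [r * weights[i] if i in kept else 0.0
            for i, r in enumerate(regions)]

features = [1.0, 1.0, 1.0, 1.0]
attention_scores = [2.0, 0.1, 1.5, 0.1]
masked = mask_redundant(features, attention_scores)
```

Only the two regions with the highest attention survive, each scaled by its weight, which mirrors the intent of suppressing redundant regions before fusion.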
Affiliation(s)
- Hang Pan
  - Department of Computer Science, Changzhi University, Changzhi, China
- Hongling Yang
  - Department of Computer Science, Changzhi University, Changzhi, China
- Lun Xie
  - School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
- Zhiliang Wang
  - School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
5. Chen Y, Zhao M, Xu Z, Li K, Ji J. Wafer defect recognition method based on multi-scale feature fusion. Front Neurosci 2023; 17:1202985. PMID: 37332866; PMCID: PMC10272367; DOI: 10.3389/fnins.2023.1202985.
Abstract
Wafer defect recognition is an important process in chip manufacturing. As different process flows can lead to different defect types, the correct identification of defect patterns is important for recognizing manufacturing problems and fixing them in good time. To achieve high-precision identification of wafer defects and improve the quality and production yield of wafers, this paper proposes a Multi-Feature Fusion Perceptual Network (MFFP-Net) inspired by human visual perception mechanisms. The MFFP-Net can process information at various scales and then aggregate it so that the next stage can abstract features from the different scales simultaneously. The proposed feature fusion module obtains finer-grained and richer features to capture key texture details and avoid the loss of important information. The final experiments show that MFFP-Net achieves good generalization ability and state-of-the-art results on the real-world dataset WM-811K, with an accuracy of 96.71%. This provides an effective way for the chip manufacturing industry to improve the yield rate.
Affiliation(s)
- Yu Chen
  - Research Center for Applied Mechanics, School of Electro-Mechanical Engineering, Xidian University, Xi'an, China
- Meng Zhao
  - Research Center for Applied Mechanics, School of Electro-Mechanical Engineering, Xidian University, Xi'an, China
  - Shaanxi Key Laboratory of Space Extreme Detection, Xi'an, China
- Zhenyu Xu
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
- Kaiyue Li
  - Research Center for Applied Mechanics, School of Electro-Mechanical Engineering, Xidian University, Xi'an, China
- Jing Ji
  - Research Center for Applied Mechanics, School of Electro-Mechanical Engineering, Xidian University, Xi'an, China
  - Shaanxi Key Laboratory of Space Extreme Detection, Xi'an, China
6. Xiao X, Xiong X, Meng F, Chen Z. Multi-Scale Feature Interactive Fusion Network for RGBT Tracking. Sensors (Basel) 2023; 23:3410. PMID: 37050470; PMCID: PMC10098685; DOI: 10.3390/s23073410.
Abstract
Fusion tracking of RGB and thermal infrared images (RGBT) has attracted wide attention due to the complementary advantages of the two modalities. Currently, most algorithms obtain modality weights through attention mechanisms to integrate multi-modality information. They do not fully exploit multi-scale information and ignore the rich contextual information among features, which limits tracking performance to some extent. To solve this problem, this work proposes a new multi-scale feature interactive fusion network (MSIFNet) for RGBT tracking. Specifically, we use different convolution branches for multi-scale feature extraction and aggregate them adaptively through a feature selection module. At the same time, a Transformer interactive fusion module is proposed to build long-distance dependencies and further enhance semantic representation. Finally, a global feature fusion module is designed to adjust the global information adaptively. Numerous experiments on the publicly available GTOT, RGBT234, and LasHeR datasets show that our algorithm outperforms current mainstream tracking algorithms.
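The long-distance dependencies built by the Transformer interactive fusion module rest on scaled dot-product attention. A minimal single-query sketch in plain Python follows; it illustrates the mechanism generically and is not the MSIFNet implementation:

```python
import math

def attention(query, keys, values):
    """Scaled dot-product attention for one query over a token sequence.
    Every value can influence the output, regardless of its position,
    which is what provides the long-distance dependencies."""
    d = len(query)
    # Similarity of the query to every key, scaled by sqrt(d).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]          # softmax over scores
    # Weighted sum of the value vectors.
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

q = [1.0, 0.0]
ks = [[1.0, 0.0], [0.0, 1.0]]
vs = [[10.0, 0.0], [0.0, 10.0]]
ctx = attention(q, ks, vs)
```

The query aligned with the first key draws most of its context from the first value, while the attention weights still sum to one across all positions.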
Affiliation(s)
- Xianbing Xiao
  - School of Automation and Information Engineering, Sichuan University of Science and Engineering, Yibin 644000, China
- Xingzhong Xiong
  - Artificial Intelligence Key Laboratory of Sichuan Province, Sichuan University of Science and Engineering, Yibin 644000, China
- Fanqin Meng
  - Artificial Intelligence Key Laboratory of Sichuan Province, Sichuan University of Science and Engineering, Yibin 644000, China
- Zhen Chen
  - School of Automation and Information Engineering, Sichuan University of Science and Engineering, Yibin 644000, China
7. Jin C, Zheng A, Wu Z, Tong C. Real-Time Fire Smoke Detection Method Combining a Self-Attention Mechanism and Radial Multi-Scale Feature Connection. Sensors (Basel) 2023; 23:3358. PMID: 36992068; PMCID: PMC10054114; DOI: 10.3390/s23063358.
Abstract
Fire remains a pressing issue that requires urgent attention. Due to its uncontrollable and unpredictable nature, it can easily trigger chain reactions and increase the difficulty of extinguishing, posing a significant threat to people's lives and property. The effectiveness of traditional photoelectric- or ionization-based detectors is inhibited when detecting fire smoke due to the variable shape, characteristics, and scale of the detected objects and the small size of the fire source in the early stages. Additionally, the uneven distribution of fire and smoke and the complexity and variety of the surroundings in which they occur contribute to inconspicuous pixel-level-based feature information, making identification difficult. We propose a real-time fire smoke detection algorithm based on multi-scale feature information and an attention mechanism. Firstly, the feature information layers extracted from the network are fused into a radial connection to enhance the semantic and location information of the features. Secondly, to address the challenge of recognizing harsh fire sources, we designed a permutation self-attention mechanism to concentrate on features in channel and spatial directions to gather contextual information as accurately as possible. Thirdly, we constructed a new feature extraction module to increase the detection efficiency of the network while retaining feature information. Finally, we propose a cross-grid sample matching approach and a weighted decay loss function to handle the issue of imbalanced samples. Our model achieves the best detection results compared to standard detection methods using a handcrafted fire smoke detection dataset, with APval reaching 62.5%, APSval reaching 58.5%, and FPS reaching 113.6.
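One standard way to handle the imbalanced-sample issue the abstract raises is to weight the loss by class. The sketch below shows a generic class-weighted binary cross-entropy as a reference point only; the paper's weighted decay loss and cross-grid sample matching differ in detail:

```python
import math

def weighted_bce(preds, labels, pos_weight):
    """Generic class-weighted binary cross-entropy. Upweighting the rare
    positive class (pos_weight > 1) counters sample imbalance; this is an
    illustration, not the paper's exact loss."""
    total = 0.0
    for p, y in zip(preds, labels):
        p = min(max(p, 1e-7), 1.0 - 1e-7)   # clamp for numerical safety
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(preds)

# Same predictions, with and without upweighting the positive class.
balanced = weighted_bce([0.9, 0.1], [1, 0], pos_weight=1.0)
upweighted = weighted_bce([0.9, 0.1], [1, 0], pos_weight=5.0)
```

Raising `pos_weight` increases the penalty on positive-class errors, pushing the model to pay more attention to the under-represented class.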
Affiliation(s)
- Chuan Jin
  - School of Sciences, Hangzhou Dianzi University, Hangzhou 310018, China
- Anqi Zheng
  - School of Sciences, Hangzhou Dianzi University, Hangzhou 310018, China
- Zhaoying Wu
  - Southeast-Monash Joint Graduate School, Southeast University, Suzhou 210096, China
- Changqing Tong
  - School of Sciences, Hangzhou Dianzi University, Hangzhou 310018, China
8. Li Y, Rao Y, Jin X, Jiang Z, Wang Y, Wang T, Wang F, Luo Q, Liu L. YOLOv5s-FP: A Novel Method for In-Field Pear Detection Using a Transformer Encoder and Multi-Scale Collaboration Perception. Sensors (Basel) 2022; 23:30. PMID: 36616628; PMCID: PMC9823628; DOI: 10.3390/s23010030.
Abstract
Precise pear detection and recognition is an essential step toward modernizing orchard management. However, due to ubiquitous occlusion in orchards and the various locations of image acquisition, the pears in acquired images may be quite small and occluded, causing high false detection and object loss rates. In this paper, a multi-scale collaborative perception network, YOLOv5s-FP (Fusion and Perception), was proposed for pear detection, coupling local and global features. Specifically, a pear dataset with a high proportion of small and occluded pears was constructed, comprising 3680 images acquired with cameras mounted on a ground tripod and a UAV platform. The cross-stage partial (CSP) module was optimized to extract global features through a transformer encoder, which were then fused with local features by an attentional feature fusion mechanism. Subsequently, a modified path aggregation network oriented to collaborative perception of multi-scale features was proposed by incorporating a transformer encoder, the optimized CSP, and new skip connections. The quantitative results of utilizing YOLOv5s-FP for pear detection were compared with other typical object detection networks of the YOLO series, recording the highest average precision of 96.12% with less detection time and computational cost. In qualitative experiments, the proposed network achieved superior visual performance with stronger robustness to changes in occlusion and illumination conditions, particularly providing the ability to detect pears of different sizes in highly dense, overlapping environments and non-normal illumination areas. Therefore, the proposed YOLOv5s-FP network is practicable for detecting in-field pears in a real-time and accurate way, and could be an advantageous component of technology for monitoring pear growth status and implementing automated harvesting in unmanned orchards.
Affiliation(s)
- Yipu Li
  - College of Information and Computer Science, Anhui Agricultural University, Hefei 230036, China
  - Key Laboratory of Agricultural Sensors, Ministry of Agriculture and Rural Affairs, Hefei 230036, China
  - Anhui Provincial Key Laboratory of Smart Agricultural Technology and Equipment, Hefei 230036, China
- Yuan Rao
  - College of Information and Computer Science, Anhui Agricultural University, Hefei 230036, China
  - Key Laboratory of Agricultural Sensors, Ministry of Agriculture and Rural Affairs, Hefei 230036, China
  - Anhui Provincial Key Laboratory of Smart Agricultural Technology and Equipment, Hefei 230036, China
- Xiu Jin
  - College of Information and Computer Science, Anhui Agricultural University, Hefei 230036, China
  - Key Laboratory of Agricultural Sensors, Ministry of Agriculture and Rural Affairs, Hefei 230036, China
  - Anhui Provincial Key Laboratory of Smart Agricultural Technology and Equipment, Hefei 230036, China
- Zhaohui Jiang
  - College of Information and Computer Science, Anhui Agricultural University, Hefei 230036, China
  - Key Laboratory of Agricultural Sensors, Ministry of Agriculture and Rural Affairs, Hefei 230036, China
  - Anhui Provincial Key Laboratory of Smart Agricultural Technology and Equipment, Hefei 230036, China
- Yuwei Wang
  - Key Laboratory of Agricultural Sensors, Ministry of Agriculture and Rural Affairs, Hefei 230036, China
  - Anhui Provincial Key Laboratory of Smart Agricultural Technology and Equipment, Hefei 230036, China
- Tan Wang
  - College of Information and Computer Science, Anhui Agricultural University, Hefei 230036, China
  - Key Laboratory of Agricultural Sensors, Ministry of Agriculture and Rural Affairs, Hefei 230036, China
  - Anhui Provincial Key Laboratory of Smart Agricultural Technology and Equipment, Hefei 230036, China
- Fengyi Wang
  - College of Information and Computer Science, Anhui Agricultural University, Hefei 230036, China
  - Anhui Provincial Key Laboratory of Smart Agricultural Technology and Equipment, Hefei 230036, China
- Qing Luo
  - College of Information and Computer Science, Anhui Agricultural University, Hefei 230036, China
  - Anhui Provincial Key Laboratory of Smart Agricultural Technology and Equipment, Hefei 230036, China
- Lu Liu
  - Key Laboratory of Agricultural Sensors, Ministry of Agriculture and Rural Affairs, Hefei 230036, China
  - Anhui Provincial Key Laboratory of Smart Agricultural Technology and Equipment, Hefei 230036, China
9. Ding D, Li Y, Zhao P, Li K, Jiang S, Liu Y. Single Infrared Image Stripe Removal via Residual Attention Network. Sensors (Basel) 2022; 22:8734. PMID: 36433332; PMCID: PMC9698763; DOI: 10.3390/s22228734.
Abstract
The non-uniformity of the readout circuit response in infrared focal plane array detectors can result in stripe-shaped fixed-pattern noise, which seriously affects the quality of infrared images. Considering the problems of existing non-uniformity correction methods, such as loss of image detail and edge blurring, a multi-scale residual network with an attention mechanism is proposed for single-image stripe noise removal. A multi-scale feature representation module is designed to decompose the original image into varying scales to obtain more image information. The product of a directional structure similarity parameter and a Gaussian-weighted Mahalanobis distance is used as the similarity metric; a channel-spatial attention mechanism based on similarity (CSAS) ensures the extraction of more discriminative channel and spatial features. The method eliminates stripe noise in the vertical and horizontal directions, respectively, while preserving the edge texture information of the image. The experimental results show that the proposed method outperforms four state-of-the-art methods by a large margin in qualitative and quantitative assessments. One hundred infrared images with different simulated noise intensities are used to verify the performance of our method; the average peak signal-to-noise ratio and average structural similarity of the corrected images exceed 40.08 dB and 0.98, respectively.
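For context, a classical baseline for column-stripe fixed-pattern noise is to subtract each column's deviation from the global mean. The sketch below shows that baseline only; the paper's residual attention network aims to do far better, particularly at preserving detail and edges:

```python
def remove_stripes(image):
    """Baseline column-offset (stripe) correction: estimate each column's
    mean, then subtract its deviation from the global mean. A simple
    reference method, not the paper's network."""
    rows, cols = len(image), len(image[0])
    col_means = [sum(image[r][c] for r in range(rows)) / rows
                 for c in range(cols)]
    global_mean = sum(col_means) / cols
    return [[image[r][c] - (col_means[c] - global_mean) for c in range(cols)]
            for r in range(rows)]

striped = [[1.0, 3.0], [1.0, 3.0]]   # flat scene with a +2 stripe in column 1
corrected = remove_stripes(striped)
```

On a truly flat scene the baseline recovers a uniform image, but on real scenes it smears vertical structure into the correction, which is exactly the detail-loss problem the learned method targets.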
Affiliation(s)
- Dan Ding
  - College of Physics, Changchun University of Science and Technology, Changchun 130022, China
- Ye Li
  - College of Physics, Changchun University of Science and Technology, Changchun 130022, China
- Peng Zhao
  - College of Physics, Changchun University of Science and Technology, Changchun 130022, China
- Kaitai Li
  - College of Physics, Changchun University of Science and Technology, Changchun 130022, China
- Sheng Jiang
  - College of Physics, Changchun University of Science and Technology, Changchun 130022, China
- Yanxiu Liu
  - College of Physics, Changchun University of Science and Technology, Changchun 130022, China
  - College of Electronic Information Engineering, Changchun University, Changchun 130022, China
10. Zhang D, Han H, Du S, Zhu L, Yang J, Wang X, Wang L, Xu M. MPMR: Multi-Scale Feature and Probability Map for Melanoma Recognition. Front Med (Lausanne) 2022; 8:775587. PMID: 35071264; PMCID: PMC8766801; DOI: 10.3389/fmed.2021.775587.
Abstract
Malignant melanoma (MM) recognition in whole-slide images (WSIs) is challenging due to the huge image size of billions of pixels and complex visual characteristics. We propose a novel automatic melanoma recognition method based on multi-scale features and a probability map, named MPMR. First, we introduce the idea of breaking the WSI into patches to overcome the difficult-to-compute problem of WSIs of huge size. Second, to obtain and visualize the recognition result for MM tissues in WSIs, a probability mapping method is proposed to generate a mask based on the predicted categories, confidence probabilities, and location information of the patches. Third, considering that the pathological features related to melanoma appear at different scales, such as tissue, cell, and nucleus, and that enhancing the representation of multi-scale features is important for melanoma recognition, we construct a multi-scale feature fusion architecture with additional branch paths and shortcut connections, which extracts enriched lesion features from low-level features containing more detail information and high-level features containing more semantic information. Fourth, to improve feature extraction for irregularly shaped lesions and focus on essential features, we reconstruct the residual blocks with deformable convolutions and a channel attention mechanism, which further reduces information redundancy and noisy features. The experimental results demonstrate that the proposed method outperforms the compared algorithms, and it has potential for practical applications in clinical diagnosis.
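The patch-then-probability-map pipeline described in the first two steps can be sketched generically: tile the image, classify each tile, and write the confidence back at the tile's location. The toy classifier below (tile mean intensity) is an assumption made for illustration, standing in for the paper's CNN:

```python
def probability_map(image, patch, classify):
    """Break a WSI-like grid into patches, classify each, and write the
    confidence back to a map at the patch location (sketch of the idea)."""
    rows, cols = len(image), len(image[0])
    prob = [[0.0] * cols for _ in range(rows)]
    for r0 in range(0, rows, patch):
        for c0 in range(0, cols, patch):
            tile = [row[c0:c0 + patch] for row in image[r0:r0 + patch]]
            p = classify(tile)
            # Fill the patch footprint with the predicted confidence.
            for r in range(r0, min(r0 + patch, rows)):
                for c in range(c0, min(c0 + patch, cols)):
                    prob[r][c] = p
    return prob

# Toy classifier: "lesion" confidence is simply the tile's mean intensity.
img = [[0.0, 0.0, 1.0, 1.0],
       [0.0, 0.0, 1.0, 1.0]]
pmap = probability_map(img, 2, lambda t: sum(map(sum, t)) / (len(t) * len(t[0])))
```

The resulting map localizes the bright region patch by patch, which is the same mechanism that makes billion-pixel WSIs tractable and lets the prediction be visualized as a mask.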
Affiliation(s)
- Dong Zhang
  - Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, China
  - School of Automation Science and Engineering, Xi'an Jiaotong University, Xi'an, China
- Hongcheng Han
  - Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, China
  - School of Software Engineering, Xi'an Jiaotong University, Xi'an, China
- Shaoyi Du
  - Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, China
- Longfei Zhu
  - Dermatology Department, Second Affiliated Hospital of Xi'an Jiaotong University (Xibei Hospital), Xi'an, China
- Jing Yang
  - School of Software Engineering, Xi'an Jiaotong University, Xi'an, China
- Xijing Wang
  - Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, China
- Lin Wang
  - School of Information Science and Technology, Northwest University, Xi'an, China
- Meifeng Xu
  - Dermatology Department, Second Affiliated Hospital of Xi'an Jiaotong University (Xibei Hospital), Xi'an, China
11. Yan Q, Wang B, Gong D, Luo C, Zhao W, Shen J, Ai J, Shi Q, Zhang Y, Jin S, Zhang L, You Z. COVID-19 Chest CT Image Segmentation Network by Multi-Scale Fusion and Enhancement Operations. IEEE Trans Big Data 2021; 7:13-24. PMID: 36811064; PMCID: PMC8769014; DOI: 10.1109/tbdata.2021.3056564.
Abstract
A novel coronavirus disease 2019 (COVID-19) was detected at the end of 2019 and has since spread rapidly across countries around the world. Computed Tomography (CT) images have been used as a crucial alternative to the time-consuming RT-PCR test. However, purely manual segmentation of CT images faces a serious challenge as the number of suspected cases increases, resulting in urgent requirements for accurate and automatic segmentation of COVID-19 infections. Unfortunately, since the imaging characteristics of COVID-19 infection are diverse and similar to the background, existing medical image segmentation methods cannot achieve satisfactory performance. In this article, we establish a new deep convolutional neural network tailored for segmenting chest CT images with COVID-19 infections. We first build a large new chest CT image dataset consisting of 165,667 annotated chest CT images from 861 patients with confirmed COVID-19. Inspired by the observation that the boundary of the infected lung can be enhanced by adjusting the global intensity, the proposed deep CNN introduces a feature variation (FV) block, which adaptively adjusts the global properties of the features for segmenting the COVID-19 infection. The FV block enhances the capability of feature representation effectively and adaptively for diverse cases. We fuse features at different scales through a proposed Progressive Atrous Spatial Pyramid Pooling to handle sophisticated infection areas with diverse appearances and shapes. The proposed method achieves state-of-the-art performance: Dice similarity coefficients are 0.987 and 0.726 for lung and COVID-19 segmentation, respectively. We conducted experiments on data collected in China and Germany and show that the proposed deep CNN produces impressive performance. The proposed network enhances the segmentation of the COVID-19 infection and can connect with other techniques to contribute to remedying COVID-19 infection.
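The Dice similarity coefficients reported above (0.987 for lung, 0.726 for infection) measure the overlap between the predicted and ground-truth masks. On flattened binary masks:

```python
def dice(pred, target):
    """Dice similarity coefficient on flattened binary masks:
    2 * |P ∩ T| / (|P| + |T|), ranging from 0 (no overlap) to 1 (perfect)."""
    inter = sum(p * t for p, t in zip(pred, target))
    return 2.0 * inter / (sum(pred) + sum(target))

# One of the two predicted pixels overlaps the single target pixel.
score = dice([1, 1, 0, 0], [1, 0, 0, 0])
```

Dice is preferred over plain pixel accuracy for lesions because the infection occupies a small fraction of the image, so a trivially empty prediction can still score high on accuracy but scores zero on Dice.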
Affiliation(s)
- Qingsen Yan
  - Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA 5005, Australia
- Bo Wang
  - State Key Laboratory of Precision Measurement Technology and Instruments, Department of Precision Instrument, Innovation Center for Future Chips, Tsinghua University (THU), Beijing 100084, China
  - Beijing Jingzhen Medical Technology Ltd., Beijing 100015, China
- Dong Gong
  - Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA 5005, Australia
- Chuan Luo
  - State Key Laboratory of Precision Measurement Technology and Instruments, Tsinghua University, Beijing 100084, China
- Wei Zhao
  - Beijing Jingzhen Medical Technology Ltd., Beijing 100015, China
- Jianhu Shen
  - Beijing Jingzhen Medical Technology Ltd., Beijing 100015, China
- Jingyang Ai
  - Beijing Jingzhen Medical Technology Ltd., Beijing 100015, China
- Qinfeng Shi
  - Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA 5005, Australia
- Yanning Zhang
  - School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China
- Shuo Jin
  - Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing 100084, China
- Liang Zhang
  - School of Computer Science and Technology, Xidian University, Xi'an 710071, China
- Zheng You
  - State Key Laboratory of Precision Measurement Technology and Instruments, Department of Precision Instrument, Innovation Center for Future Chips, Tsinghua University (THU), Beijing 100084, China
12.
Abstract
Shortage of fully annotated datasets has been a limiting factor in developing deep learning based image segmentation algorithms and the problem becomes more pronounced in multi-organ segmentation. In this paper, we propose a unified training strategy that enables a novel multi-scale deep neural network to be trained on multiple partially labeled datasets for multi-organ segmentation. In addition, a new network architecture for multi-scale feature abstraction is proposed to integrate pyramid input and feature analysis into a U-shape pyramid structure. To bridge the semantic gap caused by directly merging features from different scales, an equal convolutional depth mechanism is introduced. Furthermore, we employ a deep supervision mechanism to refine the outputs in different scales. To fully leverage the segmentation features from all the scales, we design an adaptive weighting layer to fuse the outputs in an automatic fashion. All these mechanisms together are integrated into a Pyramid Input Pyramid Output Feature Abstraction Network (PIPO-FAN). Our proposed method was evaluated on four publicly available datasets, including BTCV, LiTS, KiTS and Spleen, where very promising performance has been achieved. The source code of this work is publicly shared at https://github.com/DIAL-RPI/PIPO-FAN to facilitate others to reproduce the work and build their own models using the introduced mechanisms.
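The adaptive weighting layer described here — fusing the segmentation outputs of all scales automatically — can be sketched as a softmax over per-scale logits followed by a weighted sum. This is a simplification of PIPO-FAN's learned layer, with illustrative names and toy inputs:

```python
import math

def fuse_outputs(outputs, logits):
    """Adaptive weighting sketch: softmax per-scale logits into fusion
    weights (learned parameters in the real network), then take the
    weighted sum of the scale outputs."""
    exps = [math.exp(l) for l in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [sum(w * o[i] for w, o in zip(weights, outputs))
            for i in range(len(outputs[0]))]

# Two scales disagree; with equal logits the fusion averages them.
scale_outputs = [[1.0, 0.0], [0.0, 1.0]]
fused = fuse_outputs(scale_outputs, logits=[0.0, 0.0])
```

During training the logits would be optimized alongside the network, letting the model learn which scale to trust for a given organ rather than fixing the weights by hand.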