1. Zeng B, Zhou Y, He D, Zhou Z, Hao S, Yi K, Li Z, Zhang W, Xie Y. Research on Lightweight Method of Insulator Target Detection Based on Improved SSD. Sensors (Basel) 2024; 24:5910. [PMID: 39338655] [PMCID: PMC11435894] [DOI: 10.3390/s24185910]
Abstract
Aiming at the problems of large model volume, slow processing speed, and difficult deployment on edge terminals, this paper proposes a lightweight insulator detection algorithm based on an improved SSD. Firstly, the original VGG-16 feature extraction network is replaced by a lightweight Ghost Module network to obtain an initial lightweight model. A Feature Pyramid Network with a Path Aggregation Network (FPN+PAN) is integrated into the Neck, and a Simplified Spatial Pyramid Pooling Fast (SimSPPF) module is introduced to fuse local and global features. Secondly, multiple Spatial and Channel Squeeze-and-Excitation (scSE) attention mechanisms are introduced in the Neck so that the model attends to the channels carrying important feature information, and the original six detection heads are reduced to four to improve inference speed. To improve recognition of occluded and overlapping targets, DIoU-NMS replaces the original non-maximum suppression (NMS). Furthermore, a channel pruning strategy removes unimportant weights from the model, and a knowledge distillation strategy fine-tunes the pruned network to preserve detection accuracy. The experimental results show that the parameter count of the proposed model is reduced from 26.15 M to 0.61 M, the computational load is reduced from 118.95 G to 1.49 G, and the mAP is increased from 96.8% to 98%. Compared with other models, the proposed model maintains detection accuracy while greatly reducing model volume, supporting visible-light insulator target detection based on edge intelligence.
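The DIoU-NMS step mentioned above replaces plain IoU suppression with IoU minus a normalized center-distance penalty, so nearby boxes of occluded objects are less likely to be wrongly suppressed. A minimal pure-Python sketch (the box format and the 0.5 threshold are illustrative choices, not taken from the paper):

```python
import math

def diou(box_a, box_b):
    """IoU minus the normalized center-distance penalty (DIoU)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # intersection and union areas
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0
    # squared distance between box centers
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
       + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # squared diagonal of the smallest enclosing box
    cx2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2
    cy2 = (max(ay2, by2) - min(ay1, by1)) ** 2
    c2 = cx2 + cy2
    return iou - (d2 / c2 if c2 > 0 else 0.0)

def diou_nms(boxes, scores, threshold=0.5):
    """Keep the highest-scoring box; drop boxes whose DIoU with it is high."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if diou(boxes[best], boxes[i]) < threshold]
    return keep
```

Because the penalty subtracts from the IoU, two overlapping boxes whose centers are far apart score lower than plain IoU would give them, which is what helps with overlapping targets.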
Affiliation(s)
- Bing Zeng: Nanchang Institute of Technology, Nanchang 330099, China
- Yu Zhou: Nanchang Institute of Technology, Nanchang 330099, China
- Dilin He: Nanchang Institute of Technology, Nanchang 330099, China
- Zhihao Zhou: Nanchang Institute of Technology, Nanchang 330099, China
- Shitao Hao: Nanchang Institute of Technology, Nanchang 330099, China
- Kexin Yi: Nanchang Institute of Technology, Nanchang 330099, China
- Zhilong Li: State Grid Shanghai Municipal Electric Power Company Maintenance Company, Shanghai 200063, China
- Wenhua Zhang: Nanchang Institute of Technology, Nanchang 330099, China
- Yunmin Xie: Nanchang Institute of Technology, Nanchang 330099, China
2. Zhang Y, Freris NM. Adaptive Filter Pruning via Sensitivity Feedback. IEEE Trans Neural Netw Learn Syst 2024; 35:10996-11008. [PMID: 37028336] [DOI: 10.1109/tnnls.2023.3246263]
Abstract
Filter pruning is advocated for accelerating deep neural networks without dedicated hardware or libraries, while maintaining high prediction accuracy. Several works have cast pruning as a variant of L1-regularized training, which entails two challenges: 1) the L1-norm is not scaling-invariant (i.e., the regularization penalty depends on weight values) and 2) there is no rule for selecting the penalty coefficient to trade off pruning ratio against accuracy drop. To address these issues, we propose a lightweight pruning method termed adaptive sensitivity-based pruning (ASTER) which: 1) achieves scaling-invariance by refraining from modifying unpruned filter weights and 2) dynamically adjusts the pruning threshold concurrently with the training process. ASTER computes the sensitivity of the loss to the threshold on the fly (without retraining); this is carried out efficiently by applying L-BFGS solely to the batch normalization (BN) layers. It then adapts the threshold so as to maintain a fine balance between pruning ratio and model capacity. We have conducted extensive experiments on a number of state-of-the-art CNN models on benchmark datasets to illustrate the merits of our approach in terms of both FLOPs reduction and accuracy. For example, on ILSVRC-2012 our method reduces FLOPs by more than 76% for ResNet-50 with only 2.0% Top-1 accuracy degradation, while for the MobileNet v2 model it achieves a 46.6% FLOPs reduction with a Top-1 accuracy drop of only 2.77%. Even for a very lightweight classification model like MobileNet v3-small, ASTER saves 16.1% FLOPs with a negligible Top-1 accuracy drop of 0.03%.
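The threshold-pruning loop at the core of ASTER can be caricatured in a few lines: filters whose BN scaling factors fall below a global threshold are pruned, and the threshold is adjusted during training. The sketch below substitutes a simple ratio-feedback update for the paper's L-BFGS sensitivity computation, so the update rule and step size are assumptions:

```python
def pruned_mask(bn_scales, threshold):
    """A filter survives if its BN scaling factor exceeds the threshold."""
    return [abs(g) > threshold for g in bn_scales]

def adapt_threshold(bn_scales, threshold, target_ratio, step=0.01):
    """Nudge the global threshold toward a desired pruning ratio.

    A crude stand-in for ASTER's sensitivity-driven update: raise the
    threshold if too few filters are pruned, lower it if too many.
    """
    mask = pruned_mask(bn_scales, threshold)
    ratio = 1.0 - sum(mask) / len(mask)   # fraction currently pruned
    return threshold + step if ratio < target_ratio else threshold - step
```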
3. Qian Y, He Z, Wang Y, Wang B, Ling X, Gu Z, Wang H, Zeng S, Swaileh W. Hierarchical Threshold Pruning Based on Uniform Response Criterion. IEEE Trans Neural Netw Learn Syst 2024; 35:10869-10881. [PMID: 37071515] [DOI: 10.1109/tnnls.2023.3244994]
Abstract
Convolutional neural networks (CNNs) have been successfully applied to various fields. However, CNNs' overparameterization requires more memory and training time, making them unsuitable for some resource-constrained devices. To address this issue, filter pruning was proposed as one of the most efficient remedies. In this article, we propose a feature-discrimination-based filter importance criterion, the uniform response criterion (URC), as a key component of filter pruning. It converts the maximum activation responses into probabilities and then measures the importance of a filter through the distribution of these probabilities over classes. However, applying URC directly to global threshold pruning causes two problems. First, some layers would be completely pruned under a global pruning setting. Second, global threshold pruning neglects that filters in different layers have different importance. To address these issues, we propose hierarchical threshold pruning (HTP) with URC. It restricts each pruning step to a relatively redundant layer rather than comparing filter importance across all layers, which avoids pruning important filters. The effectiveness of our method rests on three techniques: 1) measuring filter importance by URC; 2) normalizing filter scores; and 3) pruning only in relatively redundant layers. Extensive experiments on CIFAR-10/100 and ImageNet show that our method achieves state-of-the-art performance on multiple benchmarks.
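One loose reading of the URC idea: normalize a filter's per-class peak responses into a probability distribution, then score the filter by how far that distribution is from uniform (a class-selective filter is more discriminative). The KL-divergence-to-uniform score below is an illustrative stand-in, not the paper's exact formula:

```python
import math

def class_response_importance(class_responses):
    """Score a filter by the non-uniformity of its class response profile.

    Converts per-class peak activations into probabilities, then returns
    the KL divergence from the uniform distribution: 0 for a perfectly
    uniform (undiscriminative) filter, larger for class-selective ones.
    """
    total = sum(class_responses)
    probs = [r / total for r in class_responses]
    k = len(probs)
    # KL(p || uniform) = sum_i p_i * log(p_i * k)
    return sum(p * math.log(p * k) for p in probs if p > 0)
```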
4. Geng X, Gao J, Zhang Y, Xu D. Complex hybrid weighted pruning method for accelerating convolutional neural networks. Sci Rep 2024; 14:5570. [PMID: 38448451] [PMCID: PMC10917793] [DOI: 10.1038/s41598-024-55942-5]
Abstract
The increasing interest in filter pruning of convolutional neural networks stems from its inherent ability to effectively compress and accelerate these networks. Currently, filter pruning mainly divides into two schools: norm-based and relation-based. These methods aim to selectively remove the least important filters according to predefined rules. However, their limitations lie in the inadequate consideration of filter diversity and of the impact of batch normalization (BN) layers on the input of the next layer, which may lead to performance degradation. To address these limitations of norm-based and similarity-based methods, this study conducts empirical analyses to reveal their drawbacks and subsequently introduces a complex hybrid weighted pruning method. By evaluating the correlations and norms between individual filters, as well as the parameters of the BN layer, our method effectively identifies and prunes the most redundant filters in a robust manner, thereby avoiding significant decreases in network performance. We conducted comprehensive and direct pruning experiments on different depths of ResNet using publicly available image classification datasets, ImageNet and CIFAR-10. The results demonstrate the significant efficacy of our approach. In particular, when applied to ResNet-50 on the ImageNet dataset, our method achieves a significant reduction of 53.5% in floating-point operations, with a performance loss of only 0.6%.
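A hybrid criterion of this general kind can be sketched by blending three of the signals the abstract names: a filter's own norm, its average distance to the other filters in the layer (diversity), and its BN scaling factor. The blend weights here are arbitrary placeholders, not the paper's tuned values:

```python
import math

def hybrid_score(filt, others, bn_scale, alpha=0.5, beta=0.3, gamma=0.2):
    """Blend three importance signals for one (flattened) filter:
      - its own L2 norm (norm-based school),
      - its average Euclidean distance to other filters (relation-based),
      - the magnitude of its BN scaling factor.
    Low-scoring filters are the redundant ones to prune.
    """
    norm = math.sqrt(sum(w * w for w in filt))
    avg_dist = sum(
        math.sqrt(sum((a - b) ** 2 for a, b in zip(filt, o))) for o in others
    ) / len(others)
    return alpha * norm + beta * avg_dist + gamma * abs(bn_scale)
```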
Affiliation(s)
- Xu Geng: School of Information and Communication Engineering, Hainan University, Haikou 570228, China
- Jinxiong Gao: School of Information and Communication Engineering, Hainan University, Haikou 570228, China
- Yonghui Zhang: School of Information and Communication Engineering, Hainan University, Haikou 570228, China
- Dingtan Xu: School of Information and Communication Engineering, Hainan University, Haikou 570228, China
5. Zheng YJ, Chen SB, Ding CHQ, Luo B. Model Compression Based on Differentiable Network Channel Pruning. IEEE Trans Neural Netw Learn Syst 2023; 34:10203-10212. [PMID: 35427225] [DOI: 10.1109/tnnls.2022.3165123]
Abstract
Although neural networks have achieved great success in various fields, applications on mobile devices are limited by the computational and storage costs required for large models. Model compression (neural network pruning) technology can significantly reduce network parameters and improve computational efficiency. In this article, we propose a differentiable network channel pruning (DNCP) method for model compression. Unlike existing methods that require sampling and evaluating a large number of substructures, our method can efficiently search, through gradient descent, for an optimal substructure that meets resource constraints (e.g., FLOPs). Specifically, we assign a learnable probability to each possible number of channels in each layer of the network, relax the selection of a particular number of channels to a softmax over all possible numbers of channels, and optimize the learnable probability in an end-to-end manner through gradient descent. After the network parameters are optimized, we prune the network according to the learnable probability to obtain the optimal substructure. To demonstrate the effectiveness and efficiency of DNCP, experiments are conducted with ResNet and MobileNet V2 on the CIFAR, Tiny ImageNet, and ImageNet datasets.
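The softmax relaxation at the heart of DNCP is easy to illustrate: assign a learnable logit to each candidate channel count, take the expectation under the softmax during training (which is differentiable in the logits), and pick the argmax candidate after training. A toy sketch, with the candidate counts chosen arbitrarily:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def expected_channels(logits, candidates):
    """Differentiable relaxation: the expected channel count under the
    softmax distribution over candidate counts."""
    probs = softmax(logits)
    return sum(p * c for p, c in zip(probs, candidates))

def select_channels(logits, candidates):
    """After training, harden the choice: keep the most probable count."""
    probs = softmax(logits)
    return candidates[probs.index(max(probs))]
```

During training the expectation (not a hard choice) enters the resource-constraint term, so gradients flow back into the logits; pruning happens only once, at the end.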
6. Leng J, Chen X, Zhao J, Wang C, Zhu J, Yan Y, Zhao J, Shi W, Zhu Z, Jiang X, Lou Y, Feng C, Yang Q, Xu F. A Light Vehicle License-Plate-Recognition System Based on Hybrid Edge-Cloud Computing. Sensors (Basel) 2023; 23:8913. [PMID: 37960612] [PMCID: PMC10650870] [DOI: 10.3390/s23218913]
Abstract
With the world moving towards low-carbon and environmentally friendly development, new-energy vehicles are growing rapidly, and deep-learning-based license-plate-recognition (LPR) algorithms have become widespread. However, existing LPR systems have difficulty achieving timely, effective, and energy-saving recognition due to inherent limitations such as high latency and energy consumption. This paper proposes an innovative, lightweight Edge-LPR system that leverages edge computing and lightweight network models, mitigating the excessive reliance on cloud computing capacity and the uneven allocation of cloud resources. Channel pruning was used to reconstruct the backbone layer, reduce the network model parameters, and effectively reduce GPU resource consumption. By utilizing the computing resources of the Intel second-generation compute stick, the network models were deployed on edge gateways to detect license plates directly. The reliability and effectiveness of the Edge-LPR system were validated through experimental analysis of the CCPD standard dataset and a real-time monitoring dataset from charging stations. The experimental results on the CCPD dataset demonstrate that the network's total number of parameters is only 0.606 MB, with an accuracy rate of 97%.
Affiliation(s)
- Jiancai Leng, Xinyi Chen, Jinzhao Zhao, Chongfeng Wang, Jianqun Zhu, Yihao Yan, Jiaqi Zhao, Weiyou Shi, Zhaoxin Zhu, Xiuquan Jiang, Yitai Lou, Chao Feng, Fangzhou Xu: International School of Optoelectronic Engineering, Qilu University of Technology (Shandong Academy of Sciences), 3501 Daxue Road, Changqing District, Jinan 250300, China
- Qingbo Yang: School of Mathematics and Statistics, Qilu University of Technology (Shandong Academy of Sciences), 3501 Daxue Road, Changqing District, Jinan 250300, China
7. Gong M, Gao Y, Wu Y, Zhang Y, Qin AK, Ong YS. Heterogeneous Multi-Party Learning With Data-Driven Network Sampling. IEEE Trans Pattern Anal Mach Intell 2023; 45:13328-13343. [PMID: 37379198] [DOI: 10.1109/tpami.2023.3290213]
Abstract
Multi-party learning provides an effective approach for training a machine learning model, e.g., deep neural networks (DNNs), over decentralized data by leveraging multiple decentralized computing devices, subject to legal and practical constraints. Different parties, so-called local participants, usually provide heterogeneous data in a decentralized mode, leading to non-IID data distributions across local participants, which poses a notorious challenge for multi-party learning. To address this challenge, we propose a novel heterogeneous differentiable sampling (HDS) framework. Inspired by the dropout strategy in DNNs, a data-driven network sampling strategy is devised in the HDS framework, with differentiable sampling rates that allow each local participant to extract from a common global model the optimal local model that best fits its own data properties, so that the size of the local model can be significantly reduced for more efficient inference. Meanwhile, co-adaptation of the global model via learning such local models achieves better performance under non-IID data distributions and speeds up the convergence of the global model. Experiments have demonstrated the superiority of the proposed method over several popular multi-party learning techniques in multi-party settings with non-IID data distributions.
8. He Y, Liu P, Zhu L, Yang Y. Filter Pruning by Switching to Neighboring CNNs With Good Attributes. IEEE Trans Neural Netw Learn Syst 2023; 34:8044-8056. [PMID: 35180092] [DOI: 10.1109/tnnls.2022.3149332]
Abstract
Filter pruning is effective at reducing the computational costs of neural networks. Existing methods show that updating previously pruned filters enables large model capacity and achieves better performance. However, during the iterative pruning process, the pruning criterion remains the same even as the network weights are updated to new values. In addition, when evaluating filter importance, only the magnitude information of the filters is considered. Yet filters in a neural network do not work individually; they affect one another. As a result, the magnitude of each filter, which merely reflects information about that filter itself, is not enough to judge its importance. To solve these problems, we propose meta-attribute-based filter pruning (MFP). First, to expand the existing magnitude-based pruning criteria, we introduce a new set of criteria that consider the geometric distance between filters. Additionally, to explicitly assess the current state of the network, we adaptively select the most suitable criterion for pruning via a meta-attribute, a property of the neural network at its current state. Experiments on two image classification benchmarks validate our method. For ResNet-50 on ILSVRC-2012, we reduce more than 50% of FLOPs with only 0.44% top-5 accuracy loss.
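The geometric-distance family of criteria can be sketched as follows: score each filter by its total Euclidean distance to the other filters in the layer, so a filter close to the rest (i.e., largely replaceable by them) gets a low score and becomes a pruning candidate. This is a simplified stand-in for one member of MFP's criteria set, not the full adaptive selection:

```python
import math

def distance_importance(filters):
    """Score each (flattened) filter by its total Euclidean distance to
    all filters in the layer. Filters near the center of the set carry
    redundant information and score low."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return [sum(dist(f, g) for g in filters) for f in filters]
```

Note the contrast with a pure magnitude criterion: two identical large-norm filters would both look "important" by norm, but here each scores low against the other, exposing the redundancy.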
9. Zhang X, Xie W, Li Y, Lei J, Du Q. Filter Pruning via Learned Representation Median in the Frequency Domain. IEEE Trans Cybern 2023; 53:3165-3175. [PMID: 34797771] [DOI: 10.1109/tcyb.2021.3124284]
Abstract
In this article, we propose a novel filter pruning method for deep learning networks by calculating the learned representation median (RM) in the frequency domain (LRMF). In contrast to existing filter pruning methods that remove relatively unimportant filters in the spatial domain, our approach emphasizes the removal of absolutely unimportant filters in the frequency domain. Through extensive experiments, we observed that a criterion of "relative unimportance" does not generalize well, and that the discrete cosine transform (DCT) domain can eliminate redundancy and emphasize low-frequency representation, which is consistent with the human visual system. Based on these observations, LRMF calculates the learned RM in the frequency domain and removes its corresponding filter, since it is absolutely unimportant at each layer. Thanks to this, the time-consuming fine-tuning process is not required in LRMF. The results show that LRMF outperforms state-of-the-art pruning methods. For example, with ResNet110 on CIFAR-10, it achieves a 52.3% FLOPs reduction with an improvement of 0.04% in Top-1 accuracy. With VGG16 on CIFAR-100, it reduces FLOPs by 35.9% while increasing accuracy by 0.5%. On ImageNet, ResNet18 and ResNet50 are accelerated by 53.3% and 52.7% with only 1.76% and 0.8% accuracy loss, respectively. The code is based on PyTorch and is available at https://github.com/zhangxin-xd/LRMF.
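The frequency-domain median idea can be sketched with a naive 1-D DCT: transform each filter, take the element-wise median spectrum across the layer, and mark the filter closest to that median as the redundant one. This is a drastic simplification of LRMF, which operates on learned representations with a 2-D DCT; everything below is illustrative:

```python
import math
import statistics

def dct(signal):
    """Naive DCT-II of a 1-D signal (stand-in for the 2-D DCT the paper
    applies to learned representations)."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(signal))
        for k in range(n)
    ]

def median_filter_index(filters):
    """Index of the filter whose DCT coefficients lie closest to the
    element-wise median spectrum of the layer; in the LRMF spirit, this
    'median' filter is the absolutely unimportant one to prune."""
    spectra = [dct(f) for f in filters]
    median = [statistics.median(col) for col in zip(*spectra)]
    dists = [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(s, median))) for s in spectra
    ]
    return dists.index(min(dists))
```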
10. Liu Y, Cao J, Li B, Hu W, Maybank S. Learning to Explore Distillability and Sparsability: A Joint Framework for Model Compression. IEEE Trans Pattern Anal Mach Intell 2023; 45:3378-3395. [PMID: 35731774] [DOI: 10.1109/tpami.2022.3185317]
Abstract
Deep learning shows excellent performance, usually at the expense of heavy computation. Recently, model compression has become a popular way of reducing this computation, achieved through knowledge distillation or filter pruning. Knowledge distillation improves the accuracy of a lightweight network, while filter pruning removes redundant architecture from a cumbersome network. They are two different ways of achieving model compression, but few methods consider both simultaneously. In this paper, we revisit model compression and define two attributes of a model: distillability and sparsability, which reflect how much useful knowledge can be distilled and how high a pruning ratio can be obtained, respectively. Guided by our observations and considering both accuracy and model size, a dynamic distillability-and-sparsability learning framework (DDSL) is introduced for model compression. DDSL consists of a teacher, a student, and a dean. Knowledge is distilled from the teacher to guide the student. The dean controls the training process by dynamically adjusting the distillation supervision and the sparsity supervision in a meta-learning framework. An alternating direction method of multipliers (ADMM)-based knowledge distillation-with-pruning (KDP) joint optimization algorithm is proposed to train the model. Extensive experimental results show that DDSL outperforms 24 state-of-the-art methods, including both knowledge distillation and filter pruning methods.
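The distillation half of such a joint objective is standard: a KL-divergence term between temperature-softened teacher and student outputs, added to the task loss together with an L1 sparsity term on the prunable scaling factors. The fixed weights below (and the absence of the meta-learned "dean" schedule) are assumptions of this sketch:

```python
import math

def kd_loss(student_logits, teacher_logits, temp=2.0):
    """Soft-target distillation loss: KL divergence between the teacher's
    and the student's temperature-softened distributions."""
    def soften(logits):
        m = max(logits)
        exps = [math.exp((x - m) / temp) for x in logits]
        s = sum(exps)
        return [e / s for e in exps]
    p, q = soften(teacher_logits), soften(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def joint_objective(task_loss, student_logits, teacher_logits, scales,
                    distill_w=0.5, sparsity_w=0.01):
    """Task loss + distillation supervision + L1 sparsity on the prunable
    scaling factors (illustrative fixed weights, not DDSL's dynamic ones)."""
    sparsity = sum(abs(s) for s in scales)
    return (task_loss
            + distill_w * kd_loss(student_logits, teacher_logits)
            + sparsity_w * sparsity)
```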
11. Li K, Wang B, Tian Y, Qi Z. Fast and Accurate Road Crack Detection Based on Adaptive Cost-Sensitive Loss Function. IEEE Trans Cybern 2023; 53:1051-1062. [PMID: 34546935] [DOI: 10.1109/tcyb.2021.3103885]
Abstract
Numerous detection problems in computer vision, including road crack detection, suffer from exceedingly severe foreground-background imbalance. Fortunately, modification of the loss function appears to address this problem. In this article, we propose a pixel-based adaptive weighted cross-entropy (WCE) loss in conjunction with Jaccard distance to facilitate high-quality pixel-level road crack detection. Our work demonstrates the influence of loss functions on detection outcomes and sheds light on the consecutive improvements in the field of crack detection. To verify the effectiveness of the proposed loss, we conduct extensive experiments on four public databases: CrackForest, AigleRN, Crack360, and BJN260. Compared to the vanilla WCE, the proposed loss significantly speeds up the training process while retaining the performance.
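A weighted cross-entropy plus Jaccard-distance objective of this general shape can be written down directly. The paper's adaptive per-pixel weighting is omitted here, and `pos_weight` is an illustrative constant that up-weights the rare crack (foreground) pixels:

```python
import math

def weighted_bce(preds, targets, pos_weight):
    """Pixelwise binary cross-entropy with a higher weight on the rare
    foreground class (crack pixels)."""
    n = len(preds)
    return -sum(
        pos_weight * t * math.log(p) + (1 - t) * math.log(1 - p)
        for p, t in zip(preds, targets)
    ) / n

def jaccard_distance(preds, targets, eps=1e-7):
    """Soft Jaccard (IoU) distance between prediction map and mask."""
    inter = sum(p * t for p, t in zip(preds, targets))
    union = sum(preds) + sum(targets) - inter
    return 1.0 - (inter + eps) / (union + eps)

def crack_loss(preds, targets, pos_weight=10.0, jaccard_w=1.0):
    """Combined objective in the spirit of the paper: weighted CE plus
    Jaccard distance (the adaptive weighting scheme is not modeled)."""
    return (weighted_bce(preds, targets, pos_weight)
            + jaccard_w * jaccard_distance(preds, targets))
```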
12. Regularization-based pruning of irrelevant weights in deep neural architectures. Appl Intell 2023. [DOI: 10.1007/s10489-022-04353-y]
13. Deep learning-based important weights-only transfer learning approach for COVID-19 CT-scan classification. Appl Intell 2023; 53:7201-7215. [PMID: 35875199] [PMCID: PMC9289654] [DOI: 10.1007/s10489-022-03893-7]
Abstract
COVID-19 became a pandemic for the entire world and has significantly affected the world economy. The importance of early detection and treatment of the infection cannot be overstated, and traditional diagnosis techniques take more time to detect the infection. Although numerous deep learning-based automated solutions have recently been developed, the limited computational and battery power of resource-constrained devices makes it difficult to deploy trained models for real-time inference. In this paper, to detect the presence of COVID-19 in CT-scan images, an important weights-only transfer learning method is proposed for devices with limited run-time resources. In the proposed method, pre-trained models are made point-of-care-device friendly by pruning the less important weight parameters of the model. Experiments were performed on two popular models, VGG16 and ResNet34, and the empirical results showed that the pruned ResNet34 model achieved 95.47% accuracy, 0.9216 sensitivity, 0.9567 F-score, and 0.9942 specificity with 41.96% fewer FLOPs and 20.64% fewer weight parameters on the SARS-CoV-2 CT-scan dataset. These results show that the proposed method significantly reduces the run-time resource requirements of computationally intensive models and makes them ready for use on point-of-care devices.
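Magnitude-based "important weights-only" pruning can be sketched in its simplest form: rank the weights by absolute value and zero out the smallest fraction. The keep-ratio below is an arbitrary illustration, not the paper's setting:

```python
def keep_important_weights(weights, keep_ratio=0.8):
    """Zero out the smallest-magnitude weights, keeping only the top
    `keep_ratio` fraction by absolute value (unstructured pruning)."""
    k = int(len(weights) * keep_ratio)
    ranked = sorted(range(len(weights)),
                    key=lambda i: abs(weights[i]), reverse=True)
    keep = set(ranked[:k])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]
```

Note that, unlike the filter pruning methods elsewhere in this list, zeroing individual weights shrinks storage only after sparse encoding; it does not by itself reduce dense FLOPs.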
14. Wang X, Zheng Z, He Y, Yan F, Zeng Z, Yang Y. Soft Person Reidentification Network Pruning via Blockwise Adjacent Filter Decaying. IEEE Trans Cybern 2022; 52:13293-13307. [PMID: 34910650] [DOI: 10.1109/tcyb.2021.3130047]
Abstract
Deep learning has shown significant success in person reidentification (re-id) tasks. However, most existing works focus on discriminative feature learning and impose complex neural networks, suffering from low inference efficiency. In fact, feature extraction time is also crucial for real-world applications, and lightweight models are needed. Prevailing pruning methods usually target compact classification models. However, these methods are suboptimal for compacting re-id models, which usually produce continuous features and are sensitive to network pruning. The key to pruning re-id models is retaining the original filter distribution in the continuous features as much as possible. In this work, we propose a blockwise adjacent filter decaying method to fill this gap. Specifically, given a trained model, we first evaluate the redundancy of filters based on adjacency relationships so as to preserve the original filter distribution. Second, previous layerwise pruning methods ignore that discriminative information is enhanced block by block; we therefore propose a blockwise filter pruning strategy to better utilize the block relations in the pretrained model. Third, we propose a novel filter decaying policy that progressively reduces the scale of redundant filters. Unlike conventional soft filter pruning, which directly sets filter values to zero, the proposed filter decaying keeps the pretrained knowledge as much as possible. We evaluate our method on three popular person reidentification datasets: 1) Market-1501; 2) DukeMTMC-reID; and 3) MSMT17_V1. The proposed method shows superior performance to existing state-of-the-art pruning methods. After pruning over 91.9% of the parameters on DukeMTMC-reID, the Rank-1 accuracy drops by only 3.7%, demonstrating its effectiveness for compacting person reidentification models.
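Filter decaying, as opposed to the hard zeroing of conventional soft pruning, can be sketched as multiplying the redundant filters by a decay rate at each step so they shrink toward zero gradually; the rate here is an illustrative choice:

```python
def decay_filters(filters, redundant_idx, rate=0.5):
    """One decay step: shrink the filters flagged as redundant by `rate`
    instead of zeroing them outright, so pretrained knowledge fades
    gradually over several steps rather than being lost at once."""
    return [
        [w * rate for w in f] if i in redundant_idx else f
        for i, f in enumerate(filters)
    ]
```

Applied repeatedly across training epochs, the redundant filters approach zero and can finally be removed, while intermediate epochs can still recover from a wrong redundancy decision.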
15. Liu B, Han Z, Chen X, Shao W, Jia H, Wang Y, Tang Y. A novel compact design of convolutional layers with spatial transformation towards lower-rank representation for image classification. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109723]
16. ReLP: Reinforcement Learning Pruning Method Based on Prior Knowledge. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-11058-3]
17. Ma X, Li G, Liu L, Liu H, Wang X. Accelerating deep neural network filter pruning with mask-aware convolutional computations on modern CPUs. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.07.006]
18. A Compact Parallel Pruning Scheme for Deep Learning Model and Its Mobile Instrument Deployment. Mathematics 2022. [DOI: 10.3390/math10122126]
Abstract
Single pruning algorithms use either channel pruning or filter pruning to compress a deep convolutional neural network, leaving many redundant parameters in the compressed model, while directly pruning filters largely causes the loss of key information and hurts model classification accuracy. To solve these problems, a parallel pruning algorithm combined with image enhancement is proposed. Firstly, to improve the generalization ability of the model, a random-erasing data enhancement method is introduced. Secondly, according to the scaling factors of the trained batch normalization layers, channels with small contributions are cut off to initially thin the model, and then the filters are pruned: by calculating the geometric median of the filters, redundant filters similar to it are found and pruned, with similarity measured by the distance between filters. Pruning was performed with VGG19 and DenseNet40 on the CIFAR-10 and CIFAR-100 datasets. The experimental results show that this algorithm can improve the accuracy of the model while compressing its computation and parameters to a certain extent. Finally, the method is applied in practice: combined with transfer learning, traffic objects are classified and detected on a mobile phone.
19
Tang Z, Li Z, Yang J, Qi F. P&GGD: A Joint-Way Model Optimization Strategy Based on Filter Pruning and Filter Grafting for Tea Leaves Classification. Neural Process Lett 2022. DOI: 10.1007/s11063-022-10813-w
20
Xu Z, Li J, Meng Y, Zhang X. CAP-YOLO: Channel Attention Based Pruning YOLO for Coal Mine Real-Time Intelligent Monitoring. Sensors 2022; 22:4331. PMID: 35746116. PMCID: PMC9229694. DOI: 10.3390/s22124331
Abstract
Real-time intelligent monitoring for identifying and positioning pedestrians in coal mines is an important means of ensuring production safety. Traditional neural-network-based object detection models require significant computational and storage resources, which makes them difficult to deploy on edge devices for real-time intelligent monitoring. To address these problems, CAP-YOLO (Channel Attention based Pruning YOLO) and AEPSM (adaptive image enhancement parameter selection module) are proposed in this paper to achieve real-time intelligent analysis of coal mine surveillance videos. Firstly, DCAM (Deep Channel Attention Module) is proposed to evaluate the importance of each channel in YOLOv3. Secondly, the filters corresponding to low-importance channels are pruned to generate CAP-YOLO, which recovers accuracy through fine-tuning. Finally, because lighting conditions vary across coal mine sites, AEPSM is proposed to select parameters for CLAHE (Contrast Limited Adaptive Histogram Equalization) at each site. Experimental results show that the weight file of CAP-YOLO is 8.3× smaller than that of YOLOv3 with an mAP only 7% lower, and that CAP-YOLO's inference is three times faster than YOLOv3's. On the NVIDIA Jetson TX2, CAP-YOLO achieves an inference speed of 31 FPS.
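The core pruning step, removing the filters whose channel-attention score is lowest, can be sketched as follows. DCAM's internals are not given in the abstract, so the attention scores are treated as an input here; all names and the `keep_ratio` parameter are illustrative, not taken from the paper.

```python
import numpy as np

def prune_low_attention_channels(weights, attn_scores, keep_ratio=0.75):
    """Remove the conv filters whose learned channel-attention score
    falls in the lowest (1 - keep_ratio) fraction of the layer."""
    n = len(attn_scores)
    k = max(1, int(n * keep_ratio))
    keep = np.argsort(attn_scores)[::-1][:k]   # highest-scoring channels
    mask = np.zeros(n, dtype=bool)
    mask[keep] = True
    return weights[mask], np.sort(keep)

# toy layer: 4 filters of shape 3x3, with per-channel attention scores
w = np.random.randn(4, 3, 3)
scores = np.array([0.9, 0.1, 0.6, 0.05])
pruned_w, kept = prune_low_attention_channels(w, scores, keep_ratio=0.5)
print(pruned_w.shape, kept)  # (2, 3, 3) [0 2]
```

In the actual pipeline the pruned network would then be fine-tuned to recover the accuracy lost to filter removal.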
21
Regularized discriminative broad learning system for image classification. Knowl Based Syst 2022. DOI: 10.1016/j.knosys.2022.109306
22
Mi JX, Feng J, Huang KY. Designing efficient convolutional neural network structure: A survey. Neurocomputing 2022. DOI: 10.1016/j.neucom.2021.08.158
23
Differentiable Network Pruning via Polarization of Probabilistic Channelwise Soft Masks. Computational Intelligence and Neuroscience 2022; 2022:7775419. PMID: 35571691. PMCID: PMC9098282. DOI: 10.1155/2022/7775419
Abstract
Channel pruning has been demonstrated to be a highly effective approach to compressing large convolutional neural networks. Existing differentiable channel pruning methods usually use deterministic soft masks to scale the channelwise outputs and search for an appropriate threshold on the masks to remove unimportant channels, which sometimes causes unexpected damage to network accuracy when there is no sweet spot that clearly separates important channels from redundant ones. In this article, we introduce a new differentiable channel pruning method based on polarization of probabilistic channelwise soft masks (PPSMs). We use variational inference to approximate the posterior distributions of the masks and simultaneously exploit a polarization regularization that pushes the probabilistic masks towards either 0 or 1; the channels with near-zero masks can thus be safely eliminated with little harm to network accuracy. Our method significantly relieves the difficulty existing methods face in finding an appropriate threshold on the masks. The joint inference and polarization of probabilistic soft masks enable PPSM to yield better pruning results than the state of the art. For instance, our method prunes 65.91% of the FLOPs of ResNet50 on the ImageNet dataset with only 0.7% degradation in model accuracy.
24
Chen SB, Zheng YJ, Ding CHQ, Luo B. SIECP: Neural Network Channel Pruning Based on Sequential Interval Estimation. Neurocomputing 2022. DOI: 10.1016/j.neucom.2022.01.053
25
26
Dynamic channel pruning via activation gates. Appl Intell 2022. DOI: 10.1007/s10489-022-03383-w
27
Diao H, Lu Y, Deng A, Zou L, Li X, Pedrycz W. Convolutional rule inference network based on belief rule-based system using an evidential reasoning approach. Knowl Based Syst 2022. DOI: 10.1016/j.knosys.2021.107713
28
29
Li M, Wang Z, Shen L, Ding Q, Yu L, Jiang X. Lightweight Image Compression Based on Deep Learning. Artif Intell 2022. DOI: 10.1007/978-3-031-20497-5_9
30
Sarvani CH, Ghorai M, Dubey SR, Basha SHS. HRel: Filter pruning based on High Relevance between activation maps and class labels. Neural Netw 2021; 147:186-197. PMID: 35042156. DOI: 10.1016/j.neunet.2021.12.017
Abstract
This paper proposes an Information Bottleneck theory based filter pruning method that uses a statistical measure called Mutual Information (MI). The MI between filters and class labels, also called Relevance, is computed using the filters' activation maps and the annotations. Filters with High Relevance (HRel) are considered more important; consequently, the least important filters, which have lower Mutual Information with the class labels, are pruned. Unlike existing MI based pruning methods, the proposed method determines the significance of the filters purely from the relationship between their activation maps and the class labels. Architectures such as LeNet-5, VGG-16, ResNet-56, ResNet-110 and ResNet-50 are used to demonstrate the efficacy of the proposed pruning method on the MNIST, CIFAR-10 and ImageNet datasets, on which it shows state-of-the-art pruning results. In the experiments, we prune 97.98%, 84.85%, 76.89%, 76.95%, and 63.99% of the Floating Point Operations (FLOPs) from LeNet-5, VGG-16, ResNet-56, ResNet-110, and ResNet-50, respectively. The proposed HRel pruning method outperforms recent state-of-the-art filter pruning methods. Even after drastically pruning the filters of the convolutional layers of LeNet-5 (from 20 and 50 to 2 and 3, respectively), only a small accuracy drop of 0.52% is observed. Notably, for VGG-16, 94.98% of the parameters are removed with a drop of only 0.36% in top-1 accuracy. ResNet-50 shows a 1.17% drop in top-5 accuracy after 66.42% of the FLOPs are pruned. In addition to pruning, the Information Plane dynamics of Information Bottleneck theory are analyzed for various Convolutional Neural Network architectures under the effect of pruning. The code is available at https://github.com/sarvanichinthapalli/HRel.
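A minimal sketch of relevance-based ranking: discretize a per-filter activation summary, estimate its mutual information with the class labels from the joint histogram, and mark the lowest-MI filters for pruning. This is an illustrative simplification (HRel's exact estimator is not reproduced here), and all function names and the binning scheme are invented for the example.

```python
import numpy as np

def mutual_information(x_bins, y):
    """MI between a discretized activation summary and class labels,
    estimated from the joint histogram."""
    joint = np.zeros((x_bins.max() + 1, y.max() + 1))
    for xi, yi in zip(x_bins, y):
        joint[xi, yi] += 1
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def rank_filters_by_relevance(activation_means, labels, n_bins=4):
    """Score each filter by MI(discretized mean activation; label);
    the lowest-scoring filters are candidates for pruning."""
    scores = []
    for a in activation_means:            # one row per filter
        edges = np.quantile(a, np.linspace(0, 1, n_bins + 1)[1:-1])
        scores.append(mutual_information(np.digitize(a, edges), labels))
    return np.argsort(scores)             # ascending: prune the front

labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
informative = np.array([0., 0.1, 0.2, 0.1, 5., 5.1, 4.9, 5.2])  # tracks label
noise = np.array([1., 5., 2., 4., 3., 1., 5., 2.])              # does not
order = rank_filters_by_relevance(np.stack([informative, noise]), labels)
print(order[-1])  # 0: the label-tracking filter has the highest relevance
```

The noise filter's activations are statistically independent of the labels, so its MI estimate is zero and it sits at the front of the pruning order.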
Affiliation(s)
- C H Sarvani: Computer Vision Group, Indian Institute of Information Technology, Sri City, Chittoor, Andhra Pradesh 517646, India.
- Mrinmoy Ghorai: Computer Vision Group, Indian Institute of Information Technology, Sri City, Chittoor, Andhra Pradesh 517646, India.
- Shiv Ram Dubey: Computer Vision and Biometrics Laboratory, Indian Institute of Information Technology, Allahabad, Uttar Pradesh 211015, India.
31
Sheng Q, Sheng H, Gao P, Li Z, Yin H. Real-Time Detection of Cook Assistant Overalls Based on Embedded Reasoning. Sensors 2021; 21:s21238069. PMID: 34884074. PMCID: PMC8659890. DOI: 10.3390/s21238069
Abstract
Currently, target detection based on convolutional neural networks plays an important role in image recognition, speech recognition and other fields. However, current network models feature complex structures and huge numbers of parameters, making them difficult to apply on embedded devices with limited computational capability and extreme sensitivity to power consumption; this restricts the application scenarios of deep learning. This paper proposes a real-time detection scheme for cook assistant overalls based on the Hi3559A embedded processor. With YOLOv3 as the benchmark network, the scheme fully mobilizes the hardware acceleration resources through network model optimization and the parallel processing technology of the processor, and improves inference speed so that the embedded device can perform real-time detection locally. The experimental results show that, through purposeful cropping, segmentation and in-depth optimization of the neural network for the specific processor, the network can recognize images accurately. In an application environment where the power consumption is only 5.5 W, the recognition speed on the embedded end increases to about 28 frames per second (the design requirement was 25 frames per second or more), so the optimized network can be effectively applied in the kitchen-overalls identification scenario.
32
Cheng H, Wang Z, Ma L, Liu X, Wei Z. Multi-task Pruning via Filter Index Sharing: A Many-Objective Optimization Approach. Cognit Comput 2021. DOI: 10.1007/s12559-021-09894-x
Abstract
State-of-the-art deep neural networks play an increasingly important role in artificial intelligence, but the huge number of parameters in these networks brings high memory cost and computational complexity. To solve this problem, filter pruning is widely used for neural network compression and acceleration. However, existing algorithms focus mainly on pruning a single model, and few results are available on multi-task pruning, which can prune multiple models and promote learning performance. By utilizing a filter sharing technique, this paper aims to establish a multi-task pruning framework for simultaneously pruning and merging filters in multi-task networks. The optimization problem of selecting the important filters is solved by developing a many-objective optimization algorithm in which three criteria are adopted as objectives. To keep the network structure, an index matrix is introduced to regulate information sharing during multi-task training. The proposed multi-task pruning algorithm is quite flexible and can be performed with either adaptive or pre-specified pruning rates. Extensive experiments verify the applicability and superiority of the proposed method on both single-task and multi-task pruning.
33
Diao H, Hao Y, Xu S, Li G. Implementation of Lightweight Convolutional Neural Networks via Layer-Wise Differentiable Compression. Sensors 2021; 21:s21103464. PMID: 34065680. PMCID: PMC8155900. DOI: 10.3390/s21103464
Abstract
Convolutional neural networks (CNNs) have achieved significant breakthroughs in various domains, such as natural language processing (NLP) and computer vision. However, performance improvement is often accompanied by large model sizes and computation costs, which makes CNNs unsuitable for resource-constrained devices. Consequently, there is an urgent need to compress CNNs so as to reduce model size and computation costs. This paper proposes a layer-wise differentiable compression (LWDC) algorithm for compressing CNNs structurally. A differentiable selection operator OS is embedded in the model so that the model can be compressed and trained simultaneously by gradient descent in one pass. Instead of pruning parameters from redundant operators, as most existing methods do, our method directly replaces the original bulky operators with more lightweight ones; it only requires specifying the set of lightweight operators and the regularization factor in advance, rather than a compression rate for each layer. The compressed model produced by our method is generic and does not need any special hardware or software support. Experimental results on CIFAR-10, CIFAR-100 and ImageNet demonstrate the effectiveness of the method: LWDC obtains more significant compression than state-of-the-art methods in most cases, while causing less performance degradation. The impact of the lightweight operators and the regularization factor on compression rate and accuracy is also evaluated.
Affiliation(s)
- Huabin Diao: Institute of Microelectronics, Chinese Academy of Sciences, Beijing 100029, China; University of Chinese Academy of Sciences, Beijing 100049, China.
- Yuexing Hao: Institute of Microelectronics, Chinese Academy of Sciences, Beijing 100029, China.
- Shaoyun Xu (corresponding author): Institute of Microelectronics, Chinese Academy of Sciences, Beijing 100029, China.
- Gongyan Li: Institute of Microelectronics, Chinese Academy of Sciences, Beijing 100029, China.
34
Field-Applicable Pig Anomaly Detection System Using Vocalization for Embedded Board Implementations. Applied Sciences (Basel) 2020. DOI: 10.3390/app10196991
Abstract
Failure to quickly and accurately detect abnormal situations, such as outbreaks of infectious disease, on pig farms can cause significant damage to the farms and to a country's pig farming industry. In this study, we propose an economical and lightweight sound-based pig anomaly detection system applicable even on small-scale farms. The system consists of a pipeline, from sound acquisition to abnormal-situation detection, and can be installed and operated on an actual pig farm. It has the following structure, executable on the embedded board TX-2: (1) a module that collects sound signals; (2) a noise-robust preprocessing module that detects sound regions in the signals and converts them into spectrograms; and (3) a pig anomaly detection module based on MnasNet, a lightweight deep learning method, to which the 8-bit filter clustering method proposed in this study is applied, reducing its size by 76.3% while maintaining identification performance. The proposed system achieved a stable F1-score of 0.947 for identifying pig abnormalities, even in various noisy pigpen environments, and its execution time allows real-time operation.
35
Zhang X, Wei Y, Yang Y, Huang TS. SG-One: Similarity Guidance Network for One-Shot Semantic Segmentation. IEEE Trans Cybern 2020; 50:3855-3865. PMID: 32497014. DOI: 10.1109/tcyb.2020.2992433
Abstract
One-shot image semantic segmentation poses the challenging task of recognizing object regions from unseen categories with only one annotated example as supervision. In this article, we propose a simple yet effective similarity guidance network to tackle the one-shot (SG-One) segmentation problem. We aim to predict the segmentation mask of a query image with reference to one densely labeled support image of the same category. To obtain a robust representative feature of the support image, we first adopt a masked average pooling strategy that produces the guidance features by taking only the pixels belonging to the annotated object in the support image into account. We then leverage cosine similarity to build the relationship between the guidance features and the features of pixels from the query image. In this way, the possibilities embedded in the produced similarity maps can guide the process of segmenting objects. Furthermore, SG-One is a unified framework that can efficiently process both support and query images within one network and can be learned in an end-to-end manner. We conduct extensive experiments on Pascal VOC 2012. In particular, SG-One achieves an mIoU score of 46.3%, surpassing the baseline methods.
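Masked average pooling and the cosine-similarity guidance map can be sketched directly. This toy NumPy version (invented names, tiny feature maps) is only meant to show the two operations the abstract describes, not the SG-One implementation.

```python
import numpy as np

def masked_average_pooling(feat, mask):
    """Average the support feature map over foreground pixels only.
    feat: (C, H, W) features; mask: (H, W) binary foreground mask."""
    fg = mask.astype(bool)
    return feat[:, fg].mean(axis=1)               # (C,) guidance vector

def cosine_similarity_map(guidance, query_feat):
    """Cosine similarity between the guidance vector and every pixel
    feature of the query map; high values suggest the target object."""
    q = query_feat.reshape(query_feat.shape[0], -1)          # (C, H*W)
    denom = np.linalg.norm(guidance) * np.linalg.norm(q, axis=0) + 1e-8
    return ((guidance @ q) / denom).reshape(query_feat.shape[1:])  # (H, W)

C, H, W = 3, 2, 2
support = np.zeros((C, H, W)); support[:, 0, 0] = [1., 2., 3.]  # object pixel
mask = np.array([[1, 0], [0, 0]])
query = np.zeros((C, H, W)); query[:, 1, 1] = [1., 2., 3.]      # same feature
g = masked_average_pooling(support, mask)
sim = cosine_similarity_map(g, query)
print(sim[1, 1])  # ~1.0: the matching query pixel scores highest
```

In the full network these similarity maps are produced from learned features and used to condition the segmentation branch, rather than being thresholded directly.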
36
Sun D, Wang Y, Ni D, Wang T. AutoPath: Image-Specific Inference for 3D Segmentation. Front Neurorobot 2020; 14:49. PMID: 32792934. PMCID: PMC7393252. DOI: 10.3389/fnbot.2020.00049
Abstract
In recent years, deep convolutional neural networks (CNNs) have made great achievements in the field of medical image segmentation, in which the residual structure has played a significant role in the rapid development of CNN-based segmentation. However, 3D residual networks inevitably bring a huge computational burden to machines during inference, thus limiting their use in many real clinical applications. To tackle this issue, we propose AutoPath, an image-specific inference approach for more efficient 3D segmentation. AutoPath dynamically selects the enabled residual blocks for each input image during inference, effectively reducing total computation without degrading segmentation performance. To achieve this, a policy network is trained using reinforcement learning, with rewards for using a minimal set of residual blocks while maintaining accurate segmentation. Experimental results on a liver CT dataset show that our approach not only provides an efficient inference procedure but also attains satisfactory segmentation performance.
Affiliation(s)
- Dong Sun: National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Provincial Key Laboratory of Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China.
- Yi Wang: National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Provincial Key Laboratory of Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China.
- Dong Ni: National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Provincial Key Laboratory of Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China.
- Tianfu Wang: National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Provincial Key Laboratory of Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China.