1
Díaz-Francés JÁ, Fernández-Rodríguez JD, Thurnhofer-Hemsi K, López-Rubio E. Semi-Supervised Semantic Image Segmentation by Deep Diffusion Models and Generative Adversarial Networks. Int J Neural Syst 2024; 34:2450057. [PMID: 39155691] [DOI: 10.1142/s0129065724500576]
Abstract
Typically, deep learning models for image segmentation are trained on large datasets of images annotated at the pixel level, which is expensive and highly time-consuming. One way to reduce the number of annotated images required for training is to adopt a semi-supervised approach. In this regard, generative deep learning models, specifically Generative Adversarial Networks (GANs), have been adapted to the semi-supervised training of segmentation models. This work proposes MaskGDM, a deep learning architecture that combines ideas from EditGAN, a GAN that jointly models images and their segmentations, with a generative diffusion model. With careful integration, we find that using a generative diffusion model can improve EditGAN's performance on multiple segmentation datasets, both multi-class and binary-labeled. According to the quantitative results obtained, the proposed model improves multi-class image segmentation over the EditGAN and DatasetGAN models by [Formula: see text] and [Formula: see text], respectively. Moreover, on the ISIC dataset, our proposal improves on other models by up to [Formula: see text] for the binary image segmentation task.
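The data-economy argument above can be caricatured in a few lines of numpy. This is not the MaskGDM architecture: `generate_pair` is a hypothetical stand-in for a generator that jointly synthesizes an image and its segmentation, used only to show how synthesized pairs can pad out a small human-annotated set.

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_pair(rng, size=16):
    """Hypothetical stand-in for a joint image/segmentation generator:
    a toy disk plays both roles so the sketch stays self-contained."""
    yy, xx = np.mgrid[:size, :size]
    cx, cy, r = rng.integers(4, 12, size=3)
    mask = ((xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2).astype(np.uint8)
    image = mask * 0.8 + 0.1 * rng.random((size, size))
    return image, mask

# A few real labeled pairs plus many synthesized ones form the training set,
# reducing the pixel-level annotation burden.
labeled = [generate_pair(rng) for _ in range(2)]    # pretend: human-annotated
synthetic = [generate_pair(rng) for _ in range(8)]  # pretend: generator output
train_set = labeled + synthetic

assert len(train_set) == 10
assert all(img.shape == m.shape for img, m in train_set)
```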
Affiliation(s)
- José Ángel Díaz-Francés
- ITIS Software, University of Málaga, Calle Arquitecto Francisco Peñalosa 18, Málaga 29010, Spain
- Karl Thurnhofer-Hemsi
- ITIS Software, University of Málaga, Calle Arquitecto Francisco Peñalosa 18, Málaga 29010, Spain
- Ezequiel López-Rubio
- ITIS Software, University of Málaga, Calle Arquitecto Francisco Peñalosa 18, Málaga 29010, Spain
2
Zhou Q, Yu B, Xiao F, Ding M, Wang Z, Zhang X. Robust Semi-Supervised 3D Medical Image Segmentation With Diverse Joint-Task Learning and Decoupled Inter-Student Learning. IEEE Trans Med Imaging 2024; 43:2317-2331. [PMID: 38319753] [DOI: 10.1109/tmi.2024.3362837]
Abstract
Semi-supervised segmentation is highly significant in 3D medical image segmentation. Typical solutions adopt a teacher-student dual-model architecture and constrain the two models' decision consistency on the same segmentation task. However, the scarcity of medical samples can lower the diversity of tasks, reducing the effectiveness of the consistency constraint. The issue can worsen further as the weights of the two models gradually become synchronized. In this work, we propose constructing diverse joint tasks using masked image modelling to enhance the reliability of the consistency constraint, and we develop a novel architecture consisting of a single teacher but multiple students to exploit the additional knowledge decoupled from the synchronized weights. Specifically, the teacher and student models 'see' different randomly masked versions of an input and are trained to segment the same targets while concurrently reconstructing different missing regions. This joint task of segmentation and reconstruction lets the two learners capture related but complementary features, deriving instructive knowledge when their consistency is constrained. Moreover, two extra students join the original one to perform inter-student learning. The three students share the same encoder but different decoder designs, and they learn decoupled knowledge by constraining their mutual consistencies, preventing themselves from converging suboptimally to the biased predictions of a dictatorial teacher. Experiments on four medical datasets show that our approach performs better than six mainstream semi-supervised methods. In particular, our approach achieves at least 0.61% and 0.36% higher Dice and Jaccard values, respectively, than the most competitive approach on our in-house dataset. The code will be released at https://github.com/zxmboshi/DDL.
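The masked-view consistency idea can be sketched in numpy. `toy_model` below is a deterministic, hypothetical stand-in for the real teacher/student networks; the two loss terms only illustrate the shape of the objective (agree on segmentation despite different masks, reconstruct one's own missing regions), not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_mask(shape, ratio, rng):
    """Binary mask: 1 = visible region, 0 = masked-out region."""
    return (rng.random(shape) > ratio).astype(np.float32)

# Toy 'image' and two independently masked views for teacher and student.
image = rng.random((8, 8)).astype(np.float32)
m_teacher = random_mask(image.shape, 0.3, rng)
m_student = random_mask(image.shape, 0.3, rng)

def toy_model(x):
    # Stand-in for a segmentation/reconstruction network: a fixed affine
    # map keeps the sketch self-contained and deterministic.
    return 0.5 * x + 0.25

seg_teacher = toy_model(image * m_teacher)
seg_student = toy_model(image * m_student)

# Consistency term: the two learners must agree on the same segmentation
# targets despite seeing differently masked views.
consistency = float(np.mean((seg_teacher - seg_student) ** 2))

# Reconstruction term (student side): penalized only on its missing regions.
recon = toy_model(image * m_student)
recon_loss = float(np.sum(((recon - image) * (1 - m_student)) ** 2)
                   / max(np.sum(1 - m_student), 1))

loss = consistency + recon_loss
assert loss >= 0.0
```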
3
Zhou C, Ye L, Peng H, Liu Z, Wang J, Ramírez-De-Arellano A. A Parallel Convolutional Network Based on Spiking Neural Systems. Int J Neural Syst 2024; 34:2450022. [PMID: 38487872] [DOI: 10.1142/s0129065724500229]
Abstract
Deep convolutional neural networks have shown advanced performance in accurately segmenting images. In this paper, an SNP-like convolutional neuron structure is introduced, abstracted from the nonlinear mechanism in nonlinear spiking neural P (NSNP) systems. A U-shaped convolutional neural network named the SNP-like parallel-convolutional network, or SPC-Net, is then constructed for segmentation tasks. Dual-convolution concatenate (DCC) and dual-convolution addition (DCA) network blocks are designed for the encoder and decoder stages, respectively. The two blocks employ parallel convolutions with different kernel sizes to improve the feature representation ability and make full use of spatial detail information, and they use different feature fusion strategies to achieve feature complementarity and augmentation. Furthermore, a dual-scale pooling (DSP) module in the bottleneck is designed to improve the feature extraction capability: it extracts multi-scale contextual information and reduces information loss while extracting salient features. SPC-Net is applied to medical image segmentation tasks and compared with several recent segmentation methods on the GlaS and CRAG datasets. The proposed SPC-Net achieves a 90.77% Dice coefficient, an 83.76% IoU score, an 83.93% F1 score, an 86.33% ObjDice coefficient, and a 135.60 Obj-Hausdorff distance. The experimental results show that the proposed model achieves good segmentation performance.
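The contrast between the two fusion strategies can be illustrated with a minimal 1D numpy sketch: two parallel branches with different kernel sizes, fused by concatenation (DCC-style, encoder) or element-wise addition (DCA-style, decoder). The kernel values are arbitrary placeholders, not SPC-Net's trained weights.

```python
import numpy as np

def conv1d_same(x, kernel):
    """'Same'-padded 1D convolution (np.convolve's centred output)."""
    return np.convolve(x, kernel, mode="same")

x = np.array([0., 1., 2., 3., 4., 3., 2., 1.])

# Two parallel branches with different receptive fields
# (placeholder kernels, not learned weights).
k_small = np.array([0.25, 0.5, 0.25])          # 3-tap kernel
k_large = np.array([0.1, 0.2, 0.4, 0.2, 0.1])  # 5-tap kernel

b_small = conv1d_same(x, k_small)
b_large = conv1d_same(x, k_large)

# Encoder-style fusion: concatenate branch outputs along the channel axis.
dcc = np.stack([b_small, b_large], axis=0)   # shape (2, 8)

# Decoder-style fusion: element-wise addition of the branches.
dca = b_small + b_large                       # shape (8,)

assert dcc.shape == (2, 8)
assert dca.shape == (8,)
```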
Affiliation(s)
- Chi Zhou
- School of Computer and Software Engineering, Xihua University, Chengdu 610039, P. R. China
- Lulin Ye
- School of Computer and Software Engineering, Xihua University, Chengdu 610039, P. R. China
- Hong Peng
- School of Computer and Software Engineering, Xihua University, Chengdu 610039, P. R. China
- Zhicai Liu
- School of Computer and Software Engineering, Xihua University, Chengdu 610039, P. R. China
- Jun Wang
- School of Electrical Engineering and Electronic Information, Xihua University, Chengdu 610039, P. R. China
- Antonio Ramírez-De-Arellano
- Research Group of Natural Computing, Department of Computer Science and Artificial Intelligence, University of Seville, Sevilla 41012, Spain
4
Chen L, Leng L, Yang Z, Teoh ABJ. Enhanced Multitask Learning for Hash Code Generation of Palmprint Biometrics. Int J Neural Syst 2024; 34:2450020. [PMID: 38414422] [DOI: 10.1142/s0129065724500205]
Abstract
This paper presents a novel multitask learning framework for palmprint biometrics that jointly optimizes a classification branch and a hashing branch. The classification branch performs three distinct tasks concurrently: identity recognition and the classification of two soft biometrics, gender and chirality. The hashing branch generates palmprint hash codes, optimizing for minimal template storage and efficient matching. By amalgamating knowledge acquired from the classification branch, the hashing branch derives complementary information from these tasks, leading to better overall performance than training each task in isolation. To enhance the effectiveness of multitask learning, two additional modules are introduced: an attention mechanism module and a customized gate control module. These modules allocate higher weights to crucial channels and facilitate the integration of task-specific expert knowledge. Furthermore, an automatic weight adjustment module fine-tunes the weights assigned to the different tasks, further improving performance. Integrating the three modules yields promising accuracies across the classification tasks and notably improves authentication accuracy. Extensive experimental results validate the efficacy of the proposed framework.
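The storage-and-matching benefit of hash codes can be sketched generically: binarize a real-valued embedding into a bit string and compare codes by Hamming distance. The sign-thresholding rule and the toy embeddings below are illustrative assumptions, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(42)

def to_hash(embedding):
    """Binarize a real-valued feature vector into a compact hash code."""
    return (embedding > 0).astype(np.uint8)

def hamming(a, b):
    """Matching cost between two hash codes: number of differing bits."""
    return int(np.count_nonzero(a != b))

# Toy embeddings standing in for the hashing branch's outputs.
enrolled = rng.standard_normal(64)
probe_same = enrolled + 0.1 * rng.standard_normal(64)  # same identity + noise
probe_other = rng.standard_normal(64)                  # different identity

h_enrolled = to_hash(enrolled)
d_same = hamming(h_enrolled, to_hash(probe_same))
d_other = hamming(h_enrolled, to_hash(probe_other))

# Genuine pairs should usually be far closer in Hamming space than impostors.
assert d_same <= d_other
```

Storing 64 bits per template instead of 64 floats is what makes this kind of matching cheap at scale.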
Affiliation(s)
- Lin Chen
- Key Laboratory of Jiangxi Province for Image Processing and Pattern Recognition, Nanchang Hangkong University, Nanchang, Jiangxi, P. R. China
- Lu Leng
- Key Laboratory of Jiangxi Province for Image Processing and Pattern Recognition, Nanchang Hangkong University, Nanchang, Jiangxi, P. R. China
- Ziyuan Yang
- College of Computer Science, Sichuan University, Chengdu, Sichuan, P. R. China
- Andrew Beng Jin Teoh
- School of Electrical and Electronic Engineering, College of Engineering, Yonsei University, Seoul, Republic of Korea
5
Ding W, Li Z. Curriculum Consistency Learning and Multi-Scale Contrastive Constraint in Semi-Supervised Medical Image Segmentation. Bioengineering (Basel) 2023; 11:10. [PMID: 38247886] [PMCID: PMC10812906] [DOI: 10.3390/bioengineering11010010]
Abstract
Data scarcity poses a significant challenge in medical image segmentation, highlighting the importance of leveraging sparsely annotated data. Semi-supervised learning has emerged as an effective approach for training neural networks with limited labeled data. In this study, we introduce a curriculum consistency constraint for semi-supervised medical image segmentation, drawing inspiration from the human learning process. By dynamically comparing patch features with full-image features, we enhance the network's ability to learn. Unlike existing methods, our approach adapts the patch size to mimic a human curriculum, progressing from easy to hard tasks; this adjustment guides the model toward better convergence optima and generalization. Furthermore, we employ multi-scale contrastive learning to enhance the feature representations: our method capitalizes on features extracted from multiple layers to explore additional semantic information and point-wise representations. To evaluate the proposed approach, we conducted experiments on the Kvasir-SEG polyp dataset and the ISIC 2018 skin lesion dataset. The results demonstrate that our method surpasses state-of-the-art semi-supervised methods, achieving a 9.2% increase in mean intersection over union (mIoU) on the Kvasir-SEG dataset. This improvement substantiates the efficacy of the proposed curriculum consistency constraint and multi-scale contrastive loss.
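One way to realize an easy-to-hard patch-size curriculum is a simple linear schedule. Both the linear form and the direction (large patches early, since they are closest to the full image, shrinking over training) are assumptions for illustration; the abstract does not give the paper's actual schedule.

```python
import numpy as np

def patch_size_schedule(epoch, total_epochs, full_size=256, min_size=64):
    """Easy-to-hard curriculum sketch: start with large patches (close to
    the full image, so patch/full-image features match easily) and shrink
    them linearly as training progresses."""
    t = epoch / max(total_epochs - 1, 1)
    size = full_size - t * (full_size - min_size)
    return int(round(size))

sizes = [patch_size_schedule(e, 10) for e in range(10)]

assert sizes[0] == 256   # easiest: patch covers the full image
assert sizes[-1] == 64   # hardest: smallest patch
assert all(a >= b for a, b in zip(sizes, sizes[1:]))  # monotone schedule
```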
Affiliation(s)
- Zhen Li
- Department of Computer and Information Engineering, School of Science and Engineering, The Chinese University of Hong Kong (Shenzhen), Shenzhen 518000, China
6
Cui J, Xiao J, Hou Y, Wu X, Zhou J, Peng X, Wang Y. Unsupervised Domain Adaptive Dose Prediction via Cross-Attention Transformer and Target-Specific Knowledge Preservation. Int J Neural Syst 2023; 33:2350057. [PMID: 37771298] [DOI: 10.1142/s0129065723500570]
Abstract
Radiotherapy is one of the leading treatments for cancer. To accelerate its clinical adoption, various deep learning-based methods have been developed for automatic dose prediction. However, the effectiveness of these methods relies heavily on the availability of a substantial amount of labeled data, i.e. dose distribution maps, which cost dosimetrists considerable time and effort to acquire. For low-incidence cancers, such as cervical cancer, it is often a luxury to collect enough labeled data to train a well-performing deep learning (DL) model. To mitigate this problem, we resort to an unsupervised domain adaptation (UDA) strategy that achieves accurate dose prediction for cervical cancer (the target domain) by leveraging well-labeled, high-incidence rectal cancer data (the source domain). Specifically, we introduce a cross-attention mechanism to learn domain-invariant features and develop a cross-attention transformer-based encoder to align the two cancer domains. Meanwhile, to preserve target-specific knowledge, we employ multiple domain classifiers that force the network to extract more discriminative target features. In addition, we employ two independent convolutional neural network (CNN) decoders to compensate for the pure transformer's lack of spatial inductive bias and to generate accurate dose maps for both domains. Furthermore, two additional losses, a knowledge distillation loss (KDL) and a domain classification loss (DCL), are incorporated to transfer domain-invariant features while preserving domain-specific information. Experimental results on a rectal cancer dataset and a cervical cancer dataset demonstrate that our method achieves the best quantitative results, with [Formula: see text], [Formula: see text], and HI of 1.446, 1.231, and 0.082, respectively, and outperforms other methods in qualitative assessment.
Affiliation(s)
- Jiaqi Cui
- School of Computer Science, Sichuan University, Chengdu, P. R. China
- Jianghong Xiao
- Department of Radiation Oncology, Cancer Center, West China Hospital, Sichuan University, Chengdu, P. R. China
- Yun Hou
- Agile and Intelligent Computing Key Laboratory, Southwest China Institute of Electronic Technology, Chengdu, P. R. China
- Xi Wu
- School of Computer Science, Chengdu University of Information Technology, P. R. China
- Jiliu Zhou
- School of Computer Science, Sichuan University, Chengdu, P. R. China
- Xingchen Peng
- Department of Biotherapy, Cancer Center, West China Hospital, Sichuan University, Chengdu, P. R. China
- Yan Wang
- School of Computer Science, Sichuan University, Chengdu, P. R. China
7
Singh P, Li Y, Sikarwar A, Lei W, Gao D, Talbot MB, Sun Y, Shou MZ, Kreiman G, Zhang M. Learning to Learn: How to Continuously Teach Humans and Machines. Proc IEEE Int Conf Comput Vis Workshops (ICCVW) 2023:11674-11685. [PMID: 38784111] [PMCID: PMC11114607] [DOI: 10.1109/iccv51070.2023.01075]
Abstract
Curriculum design is a fundamental component of education. For example, when we learn mathematics at school, we build upon our knowledge of addition to learn multiplication. These and other concepts must be mastered before our first algebra lesson, which also reinforces our addition and multiplication skills. Designing a curriculum for teaching either a human or a machine shares the underlying goal of maximizing knowledge transfer from earlier to later tasks, while also minimizing forgetting of learned tasks. Prior research on curriculum design for image classification focuses on the ordering of training examples during a single offline task. Here, we investigate the effect of the order in which multiple distinct tasks are learned in a sequence. We focus on the online class-incremental continual learning setting, where algorithms or humans must learn image classes one at a time during a single pass through a dataset. We find that curriculum consistently influences learning outcomes for humans and for multiple continual machine learning algorithms across several benchmark datasets. We introduce a novel-object recognition dataset for human curriculum learning experiments and observe that curricula that are effective for humans are highly correlated with those that are effective for machines. As an initial step towards automated curriculum design for online class-incremental learning, we propose a novel algorithm, dubbed Curriculum Designer (CD), that designs and ranks curricula based on inter-class feature similarities. We find significant overlap between curricula that are empirically highly effective and those that are highly ranked by our CD. Our study establishes a framework for further research on teaching humans and machines to learn continuously using optimized curricula. Our code and data are available through this link.
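The abstract does not spell out the Curriculum Designer's ranking rule beyond "inter-class feature similarities", so the following is only an illustrative assumption: a greedy ordering over class feature centroids that spaces out confusable (highly similar) classes across the sequence.

```python
import numpy as np

def rank_curriculum(class_feats, start=0):
    """Greedy curriculum sketch (hypothetical, not the paper's CD):
    each newly introduced class is the one least cosine-similar to the
    previously learned class, spacing out confusable classes."""
    n = class_feats.shape[0]
    feats = class_feats / np.linalg.norm(class_feats, axis=1, keepdims=True)
    sim = feats @ feats.T
    order, remaining = [start], set(range(n)) - {start}
    while remaining:
        last = order[-1]
        nxt = min(remaining, key=lambda c: sim[last, c])
        order.append(nxt)
        remaining.remove(nxt)
    return order

rng = np.random.default_rng(1)
centroids = rng.standard_normal((5, 32))   # one mean feature vector per class
order = rank_curriculum(centroids)

assert sorted(order) == [0, 1, 2, 3, 4]    # a permutation of all classes
assert order[0] == 0
```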
Affiliation(s)
- Parantak Singh
- Nanyang Technological University (NTU), Singapore
- CFAR and I2R, Agency for Science, Technology and Research, Singapore
- You Li
- CFAR and I2R, Agency for Science, Technology and Research, Singapore
- University of Wisconsin-Madison, USA
- Ankur Sikarwar
- Nanyang Technological University (NTU), Singapore
- CFAR and I2R, Agency for Science, Technology and Research, Singapore
- Weixian Lei
- Show Lab, National University of Singapore, Singapore
- Difei Gao
- Show Lab, National University of Singapore, Singapore
- Morgan B Talbot
- Boston Children's Hospital, Harvard Medical School, USA
- Harvard-MIT Health Sciences and Technology, MIT
- Ying Sun
- CFAR and I2R, Agency for Science, Technology and Research, Singapore
- Mengmi Zhang
- Nanyang Technological University (NTU), Singapore
- CFAR and I2R, Agency for Science, Technology and Research, Singapore
8
Zhu Y, Wang Y, Chen H, Guo Z, Huang Q. Large-Scale Image Retrieval with Deep Attentive Global Features. Int J Neural Syst 2023; 33:2350013. [PMID: 36846979] [DOI: 10.1142/s0129065723500132]
Abstract
Obtaining discriminative features is a core problem in image retrieval. Many recent works use convolutional neural networks (CNNs) to extract features; however, clutter and occlusion interfere with the distinguishability of CNN features. To address this problem, we obtain high-response activations in the feature map using an attention mechanism. We propose two attention modules: a spatial attention module and a channel attention module. For the spatial attention module, we first capture global information and model the relations between channels as a region evaluator, which evaluates and assigns new weights to local features. For the channel attention module, we use a vector with trainable parameters to weight the importance of each feature map. The two attention modules are cascaded to adjust the weight distribution of the feature map, making the extracted features more discriminative. Furthermore, we present a scale-and-mask scheme that scales the major components and filters out meaningless local features. This scheme reduces the disadvantage of variously scaled major components in images by applying multiple scale filters, and it filters out redundant features with a MAX-Mask. Exhaustive experiments demonstrate that the two attention modules complement each other in improving performance, and our network with the three modules outperforms state-of-the-art methods on four well-known image retrieval datasets.
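The cascade of spatial then channel attention can be sketched minimally in numpy. The module internals here (the dot-product scoring rule, random stand-ins for trainable weights) are assumptions beyond what the abstract states.

```python
import numpy as np

rng = np.random.default_rng(3)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Feature map: C channels over an H x W spatial grid.
C, H, W = 4, 5, 5
feat = rng.random((C, H, W))

# Spatial attention sketch: a global channel descriptor scores each location
# (a crude 'region evaluator'), re-weighting local features.
global_desc = feat.mean(axis=(1, 2))                       # (C,)
spatial_scores = np.einsum("c,chw->hw", global_desc, feat)  # (H, W)
spatial_w = softmax(spatial_scores.ravel()).reshape(H, W)
feat = feat * spatial_w[None, :, :]

# Channel attention sketch: one weight per feature map; random values
# stand in for the trainable parameter vector.
channel_w = softmax(rng.standard_normal(C))
feat = feat * channel_w[:, None, None]

assert feat.shape == (C, H, W)
assert np.isclose(spatial_w.sum(), 1.0)
```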
Affiliation(s)
- Yingying Zhu
- College of Computer Science and Software Engineering, Shenzhen University, Nanhai Ave 3688, Shenzhen, Guangdong 518060, P. R. China
- Yinghao Wang
- College of Computer Science and Software Engineering, Shenzhen University, Nanhai Ave 3688, Shenzhen, Guangdong 518060, P. R. China
- Haonan Chen
- College of Computer Science and Software Engineering, Shenzhen University, Nanhai Ave 3688, Shenzhen, Guangdong 518060, P. R. China
- Zemian Guo
- College of Computer Science and Software Engineering, Shenzhen University, Nanhai Ave 3688, Shenzhen, Guangdong 518060, P. R. China
- Qiang Huang
- College of Computer Science and Software Engineering, Shenzhen University, Nanhai Ave 3688, Shenzhen, Guangdong 518060, P. R. China
9
Zhan B, Zhou L, Li Z, Wu X, Pu Y, Zhou J, Wang Y, Shen D. D2FE-GAN: Decoupled dual feature extraction based GAN for MRI image synthesis. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109362]