1
Liu H, Zhuang Y, Song E, Liao Y, Ye G, Yang F, Xu X, Xiao X, Hung CC. A 3D boundary-guided hybrid network with convolutions and Transformers for lung tumor segmentation in CT images. Comput Biol Med 2024;180:109009. [PMID: 39137673] [DOI: 10.1016/j.compbiomed.2024.109009] [Received: 01/28/2024] [Revised: 07/19/2024] [Accepted: 08/06/2024] [Indexed: 08/15/2024]
Abstract
Accurate lung tumor segmentation from Computed Tomography (CT) scans is crucial for lung cancer diagnosis. Since 2D methods lack the volumetric information of lung CT images, 3D convolution-based and Transformer-based methods have recently been applied to lung tumor segmentation tasks using CT imaging. However, most existing 3D methods cannot effectively integrate the local patterns learned by convolutions with the global dependencies captured by Transformers, and they largely ignore the important boundary information of lung tumors. To tackle these problems, we propose a 3D boundary-guided hybrid network using convolutions and Transformers for lung tumor segmentation, named BGHNet. In BGHNet, we first propose the Hybrid Local-Global Context Aggregation (HLGCA) module with parallel convolution and Transformer branches in the encoding phase. To aggregate local and global contexts in each branch of the HLGCA module, we not only design the Volumetric Cross-Stripe Window Transformer (VCSwin-Transformer) to build the Transformer branch with local inductive biases and large receptive fields, but also design the Volumetric Pyramid Convolution with transformer-based extensions (VPConvNeXt) to build the convolution branch with multi-scale global information. Then, we present a Boundary-Guided Feature Refinement (BGFR) module in the decoding phase, which explicitly leverages the boundary information to refine multi-stage decoding features for better performance. Extensive experiments were conducted on two lung tumor segmentation datasets, including a private dataset (HUST-Lung) and a public benchmark dataset (MSD-Lung). Results show that BGHNet outperforms other state-of-the-art 2D and 3D methods in our experiments, and it exhibits superior generalization performance on both non-contrast and contrast-enhanced CT scans.
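The core idea of the HLGCA module, running a local convolution branch and a global attention branch in parallel and fusing their outputs, can be sketched as a NumPy toy. This is an illustrative assumption, not the authors' implementation: a 3x3x3 mean filter stands in for the convolution branch, a single softmax-weighted pooling stands in for the Transformer branch, and all function names are hypothetical.

```python
import numpy as np

def conv3d_local(x, k=3):
    # Local branch: a k*k*k mean filter as a stand-in for learned 3D convolutions.
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")
    out = np.zeros_like(x)
    D, H, W = x.shape
    for d in range(D):
        for h in range(H):
            for w in range(W):
                out[d, h, w] = xp[d:d + k, h:h + k, w:w + k].mean()
    return out

def global_attention(x):
    # Global branch: softmax-weighted sum over all voxels, broadcast back
    # to the full volume (a scalar-feature caricature of self-attention).
    flat = x.reshape(-1)
    w = np.exp(flat - flat.max())
    w /= w.sum()
    ctx = (w * flat).sum()
    return np.full_like(x, ctx)

def hlgca_block(x):
    # Parallel branches fused by addition (one of several plausible fusions).
    return conv3d_local(x) + global_attention(x)
```

For a constant input volume both branches return that constant, so the fused output is simply twice the input, which makes the additive fusion easy to sanity-check.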
Affiliation(s)
- Hong Liu
- Center for Biomedical Imaging and Bioinformatics, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China.
- Yuzhou Zhuang
- Center for Biomedical Imaging and Bioinformatics, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China.
- Enmin Song
- Center for Biomedical Imaging and Bioinformatics, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China.
- Yongde Liao
- Department of Thoracic Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430074, China.
- Guanchao Ye
- Department of Thoracic Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430074, China.
- Fan Yang
- Department of Radiology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430074, China.
- Xiangyang Xu
- Center for Biomedical Imaging and Bioinformatics, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China.
- Xvhao Xiao
- Research Institute of High-Tech, Xi'an, 710025, Shaanxi, China.
- Chih-Cheng Hung
- Center for Machine Vision and Security Research, Kennesaw State University, Marietta, GA, 30060, USA.
2
Zou Z, Zou B, Kui X, Chen Z, Li Y. DGCBG-Net: A dual-branch network with global cross-modal interaction and boundary guidance for tumor segmentation in PET/CT images. Comput Methods Programs Biomed 2024;250:108125. [PMID: 38631130] [DOI: 10.1016/j.cmpb.2024.108125] [Received: 11/14/2023] [Revised: 02/24/2024] [Accepted: 03/07/2024] [Indexed: 04/19/2024]
Abstract
BACKGROUND AND OBJECTIVES Automatic tumor segmentation plays a crucial role in cancer diagnosis and treatment planning. Computed tomography (CT) and positron emission tomography (PET) are extensively employed for their complementary medical information. However, existing methods ignore bilateral cross-modal interaction of global features during feature extraction, and they underutilize multi-stage tumor boundary features. METHODS To address these limitations, we propose a dual-branch tumor segmentation network based on global cross-modal interaction and boundary guidance in PET/CT images (DGCBG-Net). DGCBG-Net consists of 1) a global cross-modal interaction module that extracts global contextual information from PET/CT images and promotes bilateral cross-modal interaction of global features; 2) a shared multi-path downsampling module that learns complementary features from the PET/CT modalities to mitigate the impact of misleading features and decrease the loss of discriminative features during downsampling; 3) a boundary prior-guided branch that extracts potential boundary features from CT images at multiple stages, assisting the semantic segmentation branch in improving the accuracy of tumor boundary segmentation. RESULTS Extensive experiments were conducted on the STS and Hecktor 2022 datasets to evaluate the proposed method. The average Dice scores of DGCBG-Net on the two datasets are 80.33% and 79.29%, with average IoU scores of 67.64% and 70.18%. DGCBG-Net outperformed the current state-of-the-art methods with a 1.77% higher Dice score and a 2.12% higher IoU score. CONCLUSIONS Extensive experimental results demonstrate that DGCBG-Net outperforms existing segmentation methods and is competitive with the state of the art.
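The boundary-guidance idea, extracting a thin contour prior from a mask and using it to re-weight segmentation features, can be illustrated with a minimal NumPy sketch. This is an assumption for illustration only: the paper's boundary branch is learned from CT features, whereas here a morphological erosion produces the contour, and the helper names are hypothetical.

```python
import numpy as np

def boundary_map(mask):
    # Boundary = mask minus its erosion: a one-voxel-thick contour.
    # Erosion with a cross-shaped structuring element via axis shifts.
    eroded = mask.copy()
    for axis in range(mask.ndim):
        eroded &= np.roll(mask, 1, axis) & np.roll(mask, -1, axis)
    return mask & ~eroded

def boundary_guided_refine(feat, mask, alpha=0.5):
    # Amplify feature responses on the predicted tumor boundary,
    # leaving interior and background features unchanged.
    b = boundary_map(mask).astype(feat.dtype)
    return feat * (1.0 + alpha * b)
```

On a 5x5 mask containing a 3x3 square, `boundary_map` keeps the 8 rim cells and drops the single interior cell; `boundary_guided_refine` then scales only those rim positions by `1 + alpha`.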
Affiliation(s)
- Ziwei Zou
- School of Computer Science and Engineering, Central South University, No. 932, Lushan South Road, Changsha, 410083, China.
- Beiji Zou
- School of Computer Science and Engineering, Central South University, No. 932, Lushan South Road, Changsha, 410083, China.
- Xiaoyan Kui
- School of Computer Science and Engineering, Central South University, No. 932, Lushan South Road, Changsha, 410083, China.
- Zhi Chen
- School of Computer Science and Engineering, Central South University, No. 932, Lushan South Road, Changsha, 410083, China.
- Yang Li
- School of Informatics, Hunan University of Chinese Medicine, No. 300, Xueshi Road, Changsha, 410208, China.
3
Zhao Y, Zhou X, Pan T, Gao S, Zhang W. Correspondence-based Generative Bayesian Deep Learning for semi-supervised volumetric medical image segmentation. Comput Med Imaging Graph 2024;113:102352. [PMID: 38341947] [DOI: 10.1016/j.compmedimag.2024.102352] [Received: 09/30/2023] [Revised: 02/03/2024] [Accepted: 02/03/2024] [Indexed: 02/13/2024]
Abstract
Automated medical image segmentation plays a crucial role in diverse clinical applications. The high annotation costs of fully-supervised medical segmentation methods have spurred growing interest in semi-supervised methods. Existing semi-supervised medical segmentation methods train a teacher segmentation network on labeled data to produce pseudo labels for unlabeled data. The quality of these pseudo labels is constrained because these methods fail to effectively address the significant bias in the data distribution learned from the limited labeled data. To address these challenges, this paper introduces a Correspondence-based Generative Bayesian Deep Learning (C-GBDL) model. Built upon the teacher-student architecture, we design a multi-scale semantic correspondence method to aid the teacher model in generating high-quality pseudo labels. Specifically, our teacher model, embedded with the multi-scale semantic correspondence, learns a better-generalized data distribution from input volumes by feature matching with the reference volumes. Additionally, a double uncertainty estimation schema is proposed to further rectify noisy pseudo labels: it takes the predictive entropy as the first uncertainty estimate and the structural similarity between the input volume and its corresponding reference volumes as the second. Four groups of comparative experiments conducted on two public medical datasets demonstrate the effectiveness and superior performance of our proposed model. Our code is available at https://github.com/yumjoo/C-GBDL.
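The double uncertainty schema is concrete enough to sketch: voxel-wise predictive entropy flags unreliable predictions, while a volume-level structural similarity to a reference volume down-weights poorly matched pseudo labels. The NumPy illustration below is a simplification under stated assumptions: the whole-volume SSIM (rather than a windowed one) and the multiplicative combination rule are this sketch's choices, not the paper's exact formulation, and all names are hypothetical.

```python
import numpy as np

def predictive_entropy(probs, eps=1e-8):
    # First uncertainty: voxel-wise entropy of the class probabilities,
    # where probs has shape (num_classes, D, H, W).
    return -(probs * np.log(probs + eps)).sum(axis=0)

def ssim_global(a, b, c1=1e-4, c2=9e-4):
    # Second uncertainty proxy: a single whole-volume SSIM score between
    # the input volume and its reference volume (no sliding window).
    mu_a, mu_b = a.mean(), b.mean()
    va, vb = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / \
           ((mu_a ** 2 + mu_b ** 2 + c1) * (va + vb + c2))

def pseudo_label_weight(probs, volume, reference, lam=0.5):
    # Combine both estimates: low entropy and high similarity yield a
    # high per-voxel confidence weight for the pseudo label.
    ent = predictive_entropy(probs)                 # (D, H, W)
    sim = ssim_global(volume, reference)            # scalar in [-1, 1]
    norm_ent = ent / np.log(probs.shape[0])         # scale entropy to [0, 1]
    return (1.0 - norm_ent) * (lam + (1 - lam) * max(sim, 0.0))
```

With this rule, a voxel whose prediction is maximally uncertain (uniform probabilities) receives weight zero regardless of the similarity term, while a confident voxel in a well-matched volume keeps a weight near one.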
Affiliation(s)
- Yuzhou Zhao
- Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai, China.
- Xinyu Zhou
- Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai, China.
- Tongxin Pan
- Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai, China.
- Shuyong Gao
- Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai, China.
- Wenqiang Zhang
- Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai, China; Shanghai Engineering Research Center of AI & Robotics, Academy for Engineering and Technology, Fudan University, Shanghai, China.