Zheng J, Liu H, Feng Y, Xu J, Zhao L. CASF-Net: Cross-attention and cross-scale fusion network for medical image segmentation.
Computer Methods and Programs in Biomedicine 2023; 229:107307. [PMID: 36571889 DOI: 10.1016/j.cmpb.2022.107307]
[Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 05/06/2022] [Revised: 11/22/2022] [Accepted: 12/09/2022] [Indexed: 06/18/2023]
Abstract
BACKGROUND
Automatic segmentation of medical images has progressed greatly owing to the development of convolutional neural networks (CNNs). However, two open problems remain for current approaches based on convolutional operations: (1) how to overcome the general limitation that CNNs lack the ability to model long-range dependencies and global contextual interactions, and (2) how to efficiently discover and integrate the global and local features implicit in the image. Notably, these two problems are interconnected, yet previous approaches mainly focus on the first and neglect the importance of information integration.
METHODS
In this paper, we propose a novel cross-attention and cross-scale fusion network (CASF-Net), which aims to explicitly tap the potential of dual-branch networks and fully integrate coarse- and fine-grained feature representations. Specifically, the dual-branch encoder focuses on modeling non-local dependencies and multi-scale contexts, significantly improving the quality of semantic segmentation. Moreover, the proposed cross-attention and cross-scale module efficiently performs multi-scale information fusion and can further explore long-range contextual information.
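As a rough illustration of the general idea of cross-attention between two branches working at different scales, the following minimal NumPy sketch lets fine-grained tokens query coarse tokens. It is not the authors' implementation: the function name cross_attention, the token counts, channel widths, and random projection matrices are all assumptions chosen for illustration, and the abstract does not specify the actual module design.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries_from, context_from, w_q, w_k, w_v):
    """Hypothetical single-head cross-attention (illustrative only).

    queries_from: (n_q, c_q) tokens from one branch
    context_from: (n_c, c_c) tokens from the other branch
    Returns fused tokens of shape (n_q, d) and the attention weights.
    """
    q = queries_from @ w_q          # (n_q, d)
    k = context_from @ w_k          # (n_c, d)
    v = context_from @ w_v          # (n_c, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])   # scaled dot products
    weights = softmax(scores, axis=-1)        # each query row sums to 1
    return weights @ v, weights


# Assumed toy shapes: an 8x8 fine-grained map and a 4x4 coarse map,
# flattened to token sequences with different channel widths.
rng = np.random.default_rng(0)
fine = rng.standard_normal((64, 32))
coarse = rng.standard_normal((16, 48))

d = 24  # shared attention width (assumption)
w_q = rng.standard_normal((32, d)) * 0.1
w_k = rng.standard_normal((48, d)) * 0.1
w_v = rng.standard_normal((48, d)) * 0.1

fused, weights = cross_attention(fine, coarse, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

Because every fine-grained token attends over all coarse tokens, each output row mixes information from the whole coarse map, which is one simple way a fusion step can carry long-range context across scales.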
RESULTS
Extensive experiments conducted on three different types of medical image segmentation tasks demonstrate the state-of-the-art performance of our proposed method both visually and numerically.
CONCLUSIONS
This paper combines the feature representation capabilities of CNNs and transformers and proposes cross-attention and cross-scale fusion algorithms. The promising results suggest new possibilities for using cross-fusion mechanisms in further downstream medical imaging tasks.