1. Xue H, Yonggang L, Min L, Lin L. A lighter hybrid feature fusion framework for polyp segmentation. Sci Rep 2024;14:23179. [PMID: 39369043] [PMCID: PMC11455952] [DOI: 10.1038/s41598-024-72763-8]
Abstract
Colonoscopy is widely recognized as the most effective method for detecting colon polyps, which is crucial for early screening of colorectal cancer. Polyp identification and segmentation in colonoscopy images require specialized medical knowledge and are often labor-intensive and expensive. Deep learning provides an intelligent and efficient approach to polyp segmentation. However, the variability in polyp size and the heterogeneity of polyp boundaries and interiors pose challenges for accurate segmentation. Transformer-based methods have become a mainstream trend for polyp segmentation, but they tend to overlook local details due to the inherent characteristics of the Transformer, leading to inferior results. Moreover, the computational burden of self-attention mechanisms hinders the practical application of these models. To address these issues, we propose a novel CNN-Transformer hybrid model for polyp segmentation (CTHP). CTHP combines the strengths of the CNN, which excels at modeling local information, and the Transformer, which excels at modeling global semantics, to enhance segmentation accuracy. We decompose the self-attention computation over the entire feature map into the width and height directions, significantly improving computational efficiency. Additionally, we design a new information propagation module and introduce additional positional bias coefficients during attention computation, which reduces the dispersal of information introduced by deep and mixed feature fusion in the Transformer. Extensive experimental results demonstrate that our proposed model achieves state-of-the-art performance on multiple benchmark datasets for polyp segmentation. Furthermore, cross-domain generalization experiments show that our model exhibits excellent generalization performance.
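The width/height decomposition of self-attention is the core efficiency idea here. A minimal PyTorch sketch of axis-wise (axial) attention, assuming multi-head attention with four heads; module and parameter names are illustrative, not the authors' code:

```python
import torch
import torch.nn as nn

class AxialAttention(nn.Module):
    """Attention along one spatial axis (height or width) instead of the
    full H*W token grid, reducing cost from O((HW)^2) toward O(HW*(H+W))."""
    def __init__(self, dim, axis):
        super().__init__()
        self.axis = axis  # 'h' or 'w'
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, x):  # x: (B, C, H, W)
        B, C, H, W = x.shape
        if self.axis == 'h':
            # treat each column as a batch item; tokens run along the height axis
            seq = x.permute(0, 3, 2, 1).reshape(B * W, H, C)
        else:
            # treat each row as a batch item; tokens run along the width axis
            seq = x.permute(0, 2, 3, 1).reshape(B * H, W, C)
        out, _ = self.attn(seq, seq, seq)
        if self.axis == 'h':
            out = out.reshape(B, W, H, C).permute(0, 3, 2, 1)
        else:
            out = out.reshape(B, H, W, C).permute(0, 3, 1, 2)
        return out

# Height attention followed by width attention approximates full 2-D attention.
x = torch.randn(1, 64, 32, 32)
y = AxialAttention(64, 'w')(AxialAttention(64, 'h')(x))
```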
Affiliation(s)
- He Xue
- Department of Anesthesia Surgery, The Affiliated Huaian No.1 People's Hospital of Nanjing Medical University, Huai'an, 223300, China
- Luo Yonggang
- Department of Cardiothoracic Surgery, The Affiliated Huaian No.1 People's Hospital of Nanjing Medical University, Huai'an, 223300, China
- Liu Min
- Department of Laboratory Medicine, The Affiliated Huaian No.1 People's Hospital of Nanjing Medical University, Huai'an, 223300, China
- Li Lin
- Department of Anesthesia Surgery, The Affiliated Huaian No.1 People's Hospital of Nanjing Medical University, Huai'an, 223300, China
2. Xu W, Xu R, Wang C, Li X, Xu S, Guo L. PSTNet: Enhanced Polyp Segmentation With Multi-Scale Alignment and Frequency Domain Integration. IEEE J Biomed Health Inform 2024;28:6042-6053. [PMID: 38954569] [DOI: 10.1109/jbhi.2024.3421550]
Abstract
Accurate segmentation of colorectal polyps in colonoscopy images is crucial for effective diagnosis and management of colorectal cancer (CRC). However, current deep learning-based methods primarily rely on fusing RGB information across multiple scales, leading to limitations in accurately identifying polyps due to restricted RGB domain information and challenges in feature misalignment during multi-scale aggregation. To address these limitations, we propose the Polyp Segmentation Network with Shunted Transformer (PSTNet), a novel approach that integrates both RGB and frequency domain cues present in the images. PSTNet comprises three key modules: the Frequency Characterization Attention Module (FCAM) for extracting frequency cues and capturing polyp characteristics, the Feature Supplementary Alignment Module (FSAM) for aligning semantic information and reducing misalignment noise, and the Cross Perception localization Module (CPM) for synergizing frequency cues with high-level semantics to achieve efficient polyp segmentation. Extensive experiments on challenging datasets demonstrate PSTNet's significant improvement in polyp segmentation accuracy across various metrics, consistently outperforming state-of-the-art methods. The integration of frequency domain cues and the novel architectural design of PSTNet contribute to advancing computer-assisted polyp segmentation, facilitating more accurate diagnosis and management of CRC.
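As a rough illustration of extracting frequency-domain cues from a feature map, here is a hedged PyTorch sketch; the cut-off radius, mask shape, and channel-gating head are assumptions rather than the paper's FCAM:

```python
import torch
import torch.nn as nn

class FrequencyCue(nn.Module):
    """Isolate high-frequency content (edges, texture) via a 2-D FFT and use
    it to reweight channels: one plausible reading of frequency-domain
    characterization; radius and head layout are assumptions."""
    def __init__(self, channels, radius=0.25):
        super().__init__()
        self.radius = radius
        self.fc = nn.Sequential(nn.Linear(channels, channels), nn.Sigmoid())

    def forward(self, x):  # x: (B, C, H, W)
        B, C, H, W = x.shape
        freq = torch.fft.fftshift(torch.fft.fft2(x, norm='ortho'), dim=(-2, -1))
        yy, xx = torch.meshgrid(torch.linspace(-1, 1, H),
                                torch.linspace(-1, 1, W), indexing='ij')
        high_mask = ((xx**2 + yy**2).sqrt() > self.radius).float().to(x.device)
        high = freq * high_mask                   # suppress low frequencies
        hf = torch.fft.ifft2(torch.fft.ifftshift(high, dim=(-2, -1)),
                             norm='ortho').real   # spatial high-frequency map
        weights = self.fc(hf.mean(dim=(2, 3)))    # (B, C) channel gates
        return x * weights[:, :, None, None] + hf

out = FrequencyCue(64)(torch.randn(2, 64, 32, 32))
```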
3. Du X, Xu X, Chen J, Zhang X, Li L, Liu H, Li S. UM-Net: Rethinking ICGNet for polyp segmentation with uncertainty modeling. Med Image Anal 2024;99:103347. [PMID: 39316997] [DOI: 10.1016/j.media.2024.103347]
Abstract
Automatic segmentation of polyps from colonoscopy images plays a critical role in the early diagnosis and treatment of colorectal cancer. Nevertheless, some bottlenecks remain. Our previous work, ICGNet, focused on polyps with intra-class inconsistency and low contrast. However, because of differences in equipment and in the specific locations and properties of polyps, the color distribution of the collected images is inconsistent. ICGNet was designed primarily around reverse-contour guide information and local-global context information, ignoring this inconsistent color distribution, which leads to overfitting and makes it difficult for the model to focus only on beneficial image content. In addition, a trustworthy segmentation model should not only produce high-precision results but also provide a measure of uncertainty to accompany its predictions so that physicians can make informed decisions. ICGNet, however, gives only the segmentation result and lacks an uncertainty measure. To cope with these bottlenecks, we extend the original ICGNet into a comprehensive and effective network (UM-Net) with two main contributions whose practical value experiments have confirmed. First, we employ a color transfer operation to weaken the relationship between color and polyps, making the model more concerned with polyp shape. Second, we provide an uncertainty estimate to represent the reliability of the segmentation results and use variance to rectify the uncertainty. Our improved method is evaluated on five polyp datasets and shows competitive results compared with other advanced methods in both learning ability and generalization capability. The source code is available at https://github.com/dxqllp/UM-Net.
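The color transfer operation is commonly implemented as Reinhard-style statistics matching; a minimal sketch under that assumption (the paper's exact transform may differ):

```python
import torch

def color_transfer(src, ref, eps=1e-6):
    """Reinhard-style colour transfer: match the per-channel mean/std of
    `src` to a reference image, weakening the colour-polyp correlation so
    the network attends to shape rather than acquisition-specific colour.
    Tensors are (C, H, W); this is a common stand-in, not the paper's code."""
    s_mu, s_std = src.mean(dim=(1, 2), keepdim=True), src.std(dim=(1, 2), keepdim=True)
    r_mu, r_std = ref.mean(dim=(1, 2), keepdim=True), ref.std(dim=(1, 2), keepdim=True)
    return (src - s_mu) / (s_std + eps) * r_std + r_mu

# Augmentation: transfer each training image toward a randomly drawn reference
# so the model sees the same polyp shapes under many colour distributions.
src = torch.rand(3, 256, 256)
ref = torch.rand(3, 256, 256)
aug = color_transfer(src, ref)
```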
Affiliation(s)
- Xiuquan Du
- Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei, China; School of Computer Science and Technology, Anhui University, Hefei, China
- Xuebin Xu
- School of Computer Science and Technology, Anhui University, Hefei, China
- Jiajia Chen
- School of Computer Science and Technology, Anhui University, Hefei, China
- Xuejun Zhang
- School of Computer Science and Technology, Anhui University, Hefei, China
- Lei Li
- Department of Neurology, Shuyang Affiliated Hospital of Nanjing University of Traditional Chinese Medicine, Suqian, China
- Heng Liu
- Department of Gastroenterology, The First Affiliated Hospital of Anhui Medical University, Hefei, China
- Shuo Li
- Department of Biomedical Engineering, Case Western Reserve University, Cleveland, USA
4. Mineo R, Salanitri FP, Bellitto G, Kavasidis I, Filippo OD, Millesimo M, Ferrari GMD, Aldinucci M, Giordano D, Palazzo S, D'Ascenzo F, Spampinato C. A Convolutional-Transformer Model for FFR and iFR Assessment From Coronary Angiography. IEEE Trans Med Imaging 2024;43:2866-2877. [PMID: 38954582] [DOI: 10.1109/tmi.2024.3383283]
Abstract
The quantification of stenosis severity from X-ray catheter angiography is a challenging task: it requires fully understanding the lesion's geometry by analyzing the dynamics of the contrast material, relying only on visual observation by clinicians. To support decision making for cardiac intervention, we propose a hybrid CNN-Transformer model for the assessment of angiography-based non-invasive fractional flow reserve (FFR) and instantaneous wave-free ratio (iFR) of intermediate coronary stenosis. Our approach predicts whether a coronary artery stenosis is hemodynamically significant and provides direct FFR and iFR estimates. This is achieved through a combination of regression and classification branches that forces the model to focus on the cut-off region of FFR (around the 0.8 FFR value), which is highly critical for decision-making. We also propose a spatio-temporal factorization mechanism that redesigns the transformer's self-attention to capture both local spatial and temporal interactions between vessel geometry, blood flow dynamics, and lesion morphology. The proposed method achieves state-of-the-art performance on a dataset of 778 exams from 389 patients. Unlike existing methods, our approach employs a single angiography view and does not require knowledge of the key frame; supervision at training time is provided by a classification loss (based on a threshold of the FFR/iFR values) and a regression loss for direct estimation. Finally, the analysis of model interpretability and calibration shows that, despite the complexity of angiographic imaging data, our method can robustly identify the location of the stenosis and correlate prediction uncertainty with the output scores.
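A sketch of the combined regression/classification supervision described above, assuming a smooth-L1 regression loss, a binary cross-entropy classification loss, and equal weighting; only the 0.8 cut-off comes from the abstract:

```python
import torch
import torch.nn as nn

class FFRLoss(nn.Module):
    """Joint objective sketch: a regression head estimates FFR directly,
    while a classification head predicts hemodynamic significance
    (FFR <= 0.8), sharpening the model around the clinical cut-off.
    Loss choices and weighting are assumptions, not the paper's exact setup."""
    def __init__(self, cutoff=0.8, alpha=1.0):
        super().__init__()
        self.cutoff, self.alpha = cutoff, alpha
        self.reg = nn.SmoothL1Loss()
        self.cls = nn.BCEWithLogitsLoss()

    def forward(self, ffr_pred, cls_logit, ffr_true):
        labels = (ffr_true <= self.cutoff).float()  # significant stenosis
        return self.reg(ffr_pred, ffr_true) + self.alpha * self.cls(cls_logit, labels)

loss_fn = FFRLoss()
loss = loss_fn(torch.rand(8), torch.randn(8), torch.rand(8))
```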
5. Fan X, Zhou J, Jiang X, Xin M, Hou L. CSAP-UNet: Convolution and self-attention paralleling network for medical image segmentation with edge enhancement. Comput Biol Med 2024;172:108265. [PMID: 38461698] [DOI: 10.1016/j.compbiomed.2024.108265]
Abstract
Convolution operates within a local window of the input image, so convolutional neural networks (CNNs) are skilled at obtaining local information. Meanwhile, the self-attention (SA) mechanism extracts features by calculating the correlation between tokens from all positions in the image, which gives it an advantage in obtaining global information. The two modules can therefore complement each other to improve feature extraction, and an effective fusion method remains a problem worthy of further study. In this paper, we propose CSAP-UNet, a network that runs CNN and SA branches in parallel with U-Net as the backbone. The encoder consists of two parallel branches, CNN and Transformer, that extract features from the input image, taking into account both global dependencies and local information. Because medical images come from certain frequency bands within the spectrum, their color channels are not as uniform as those of natural images, and medical segmentation pays more attention to lesion regions in the image. The attention fusion module (AFM) integrates channel attention and spatial attention in series to fuse the output features of the two branches. Since medical image segmentation is essentially a matter of locating the boundary of the object in the image, a boundary enhancement module (BEM) is designed in the shallow layers of the proposed network to focus more specifically on pixel-level edge details. Experimental results on three public datasets validate that CSAP-UNet outperforms state-of-the-art networks, particularly on the ISIC 2017 dataset. Cross-dataset evaluation on Kvasir and CVC-ClinicDB shows that CSAP-UNet has strong generalization ability, and ablation experiments indicate the effectiveness of the designed modules. The code for training and testing is available at https://github.com/zhouzhou1201/CSAP-UNet.git.
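A minimal PyTorch sketch of channel attention followed by spatial attention in series over the concatenated branch outputs, in the spirit of the AFM; the reduction ratio and kernel size are assumptions:

```python
import torch
import torch.nn as nn

class AttentionFusionModule(nn.Module):
    """Channel attention then spatial attention, applied in series to the
    concatenated CNN and Transformer branch outputs (a sketch of the AFM
    idea; kernel sizes and the reduction ratio are assumptions)."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid())
        self.spatial = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3), nn.Sigmoid())

    def forward(self, cnn_feat, trans_feat):
        x = torch.cat([cnn_feat, trans_feat], dim=1)
        x = x * self.channel(x)                       # reweight channels
        pooled = torch.cat([x.mean(1, keepdim=True),  # avg + max over channels
                            x.amax(1, keepdim=True)], dim=1)
        return x * self.spatial(pooled)               # reweight locations

fused = AttentionFusionModule(128)(torch.randn(1, 64, 32, 32),
                                   torch.randn(1, 64, 32, 32))
```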
Affiliation(s)
- Xiaodong Fan
- Faculty of Electrical and Control Engineering, Liaoning Technical University, Huludao, 125105, Liaoning, China
- Jing Zhou
- College of Mathematics, Bohai University, Jinzhou, 121013, Liaoning, China
- Xiaoli Jiang
- College of Mathematics, Bohai University, Jinzhou, 121013, Liaoning, China
- Meizhuo Xin
- College of Mathematics, Bohai University, Jinzhou, 121013, Liaoning, China
- Limin Hou
- Faculty of Electrical and Control Engineering, Liaoning Technical University, Huludao, 125105, Liaoning, China
6. Chen TH, Wang YT, Wu CH, Kuo CF, Cheng HT, Huang SW, Lee C. A colonial serrated polyp classification model using white-light ordinary endoscopy images with an artificial intelligence model and TensorFlow chart. BMC Gastroenterol 2024;24:99. [PMID: 38443794] [PMCID: PMC10913269] [DOI: 10.1186/s12876-024-03181-3]
Abstract
In this study, we combined data augmentation with an artificial intelligence (AI) model, a convolutional neural network (CNN), to help physicians classify colonic polyps into traditional adenoma (TA), sessile serrated adenoma (SSA), and hyperplastic polyp (HP). We collected ordinary endoscopy images under both white and NBI light. Under white light, we collected 257 images of HP, 423 images of SSA, and 60 images of TA; under NBI light, we collected 238 images of HP, 284 images of SSA, and 71 images of TA. We implemented the CNN-based AI model Inception V4 to build a classification model for the types of colon polyps. Our final AI classification model, with the data augmentation process, was constructed using only white-light images. Its classification accuracy for colon polyp type is 94%, and its discriminability (area under the curve) is 98%. We therefore conclude that our model can help physicians distinguish between TA, SSA, and HP and correctly identify precancerous lesions such as TA and SSA.
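A hedged sketch of the training-side ingredients named above, data augmentation plus an Inception V4 classifier with three output classes; the exact transforms are assumptions, and the model here is drawn from the timm package as a stand-in (the authors built theirs in TensorFlow):

```python
import torch
import timm
import torchvision.transforms as T

# Augmentation of the kind described (the exact transforms are assumed),
# applied to white-light endoscopy images before training.
train_tf = T.Compose([
    T.RandomHorizontalFlip(),
    T.RandomRotation(20),
    T.ColorJitter(brightness=0.2, contrast=0.2),
    T.Resize((299, 299)),   # Inception-style input size
    T.ToTensor(),
])

# Inception V4 as provided by timm (a substitution for the paper's
# TensorFlow model). Three classes: TA, SSA, HP.
model = timm.create_model('inception_v4', pretrained=False, num_classes=3)
logits = model(torch.randn(1, 3, 299, 299))   # -> (1, 3) class scores
```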
Affiliation(s)
- Tsung-Hsing Chen
- Department of Gastroenterology and Hepatology, Linkou Medical Center, Chang Gung Memorial Hospital, Taoyuan, Taiwan
- College of Medicine, Chang Gung University, Taoyuan, Taiwan
- Chi-Huan Wu
- Department of Gastroenterology and Hepatology, Linkou Medical Center, Chang Gung Memorial Hospital, Taoyuan, Taiwan
- College of Medicine, Chang Gung University, Taoyuan, Taiwan
- Chang-Fu Kuo
- Division of Rheumatology, Allergy, and Immunology, Chang Gung Memorial Hospital-Linkou and Chang Gung University College of Medicine, Taoyuan, Taiwan, ROC
- Center for Artificial Intelligence in Medicine, Chang Gung Memorial Hospital, Taoyuan, Taiwan, ROC
- Hao-Tsai Cheng
- Department of Gastroenterology and Hepatology, Linkou Medical Center, Chang Gung Memorial Hospital, Taoyuan, Taiwan
- College of Medicine, Chang Gung University, Taoyuan, Taiwan
- Division of Gastroenterology and Hepatology, Department of Internal Medicine, New Taipei Municipal TuCheng Hospital, New Taipei City, Taiwan
- Graduate Institute of Clinical Medicine, College of Medicine, Chang Gung University, Taoyuan City, Taiwan
- Shu-Wei Huang
- Department of Gastroenterology and Hepatology, Linkou Medical Center, Chang Gung Memorial Hospital, Taoyuan, Taiwan
- College of Medicine, Chang Gung University, Taoyuan, Taiwan
- Division of Gastroenterology and Hepatology, Department of Internal Medicine, New Taipei Municipal TuCheng Hospital, New Taipei City, Taiwan
- Chieh Lee
- Department of Information and Management, College of Business, National Sun Yat-sen University, Kaohsiung City, Taiwan
7. Zhang Y, Dong J. MAEF-Net: MLP Attention for Feature Enhancement in U-Net based Medical Image Segmentation Networks. IEEE J Biomed Health Inform 2024;28:846-857. [PMID: 37976191] [DOI: 10.1109/jbhi.2023.3332908]
Abstract
Medical image segmentation plays an important role in diagnosis. Since the introduction of U-Net, numerous advancements have been made to enhance its performance and expand its applicability. The advent of Transformers in computer vision has led to the integration of self-attention mechanisms into U-Net, resulting in significant breakthroughs. However, the inherent complexity of Transformers renders these networks computationally demanding and parameter-heavy. Recent studies have demonstrated that multilayer perceptrons (MLPs), with their simpler architecture, can achieve performance comparable to Transformers in natural language processing and computer vision tasks. Building upon these findings, we have enhanced the previously proposed "Enhanced-Feature-Four-Fold-Net" (EF3-Net) by introducing an MLP-attention block to learn long-range dependencies and expand the receptive field. This enhanced network is termed "MLP-Attention Enhanced-Feature-Four-Fold-Net", abbreviated as "MAEF-Net". To further enhance accuracy while reducing computational complexity, the proposed network incorporates additional efficient design elements. MAEF-Net was evaluated against several general and specialized medical image segmentation networks on four challenging medical image datasets. The results demonstrate that the proposed network exhibits high computational efficiency and comparable or superior performance to EF3-Net and several state-of-the-art methods, particularly in segmenting blurry objects.
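One common way to realize MLP attention over positions is a token-mixing MLP across the flattened spatial dimension; a minimal sketch under that assumption (hidden sizes are illustrative, not the paper's block):

```python
import torch
import torch.nn as nn

class MLPAttention(nn.Module):
    """Token-mixing MLP block: a two-layer MLP applied across the flattened
    spatial dimension lets every position exchange information with every
    other, giving a long-range receptive field without self-attention."""
    def __init__(self, num_tokens, dim, hidden=256):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.mix = nn.Sequential(
            nn.Linear(num_tokens, hidden), nn.GELU(),
            nn.Linear(hidden, num_tokens))

    def forward(self, x):  # x: (B, C, H, W)
        B, C, H, W = x.shape
        tokens = x.flatten(2)                       # (B, C, H*W)
        tokens = self.norm(tokens.transpose(1, 2)).transpose(1, 2)
        tokens = tokens + self.mix(tokens)          # mix across positions
        return tokens.reshape(B, C, H, W)

y = MLPAttention(num_tokens=16 * 16, dim=64)(torch.randn(2, 64, 16, 16))
```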
8. Xiang S, Wei L, Hu K. Lightweight colon polyp segmentation algorithm based on improved DeepLabV3. J Cancer 2024;15:41-53. [PMID: 38164274] [PMCID: PMC10751669] [DOI: 10.7150/jca.88684]
Abstract
To address the complexity of current polyp segmentation models and the need for further improvement in segmentation accuracy, a lightweight polyp segmentation network, Li-DeepLabV3+, is proposed. First, an optimized MobileNetV2 network is used as the backbone to reduce model complexity. Second, an improved simple pyramid pooling module replaces the original Atrous Spatial Pyramid Pooling structure, which improves training efficiency while reducing the number of parameters. Finally, to enhance the feature representation, the feature fusion module fuses low-level and high-level features with an improved Unified Attention Fusion Module, which applies both channel and spatial attention to enrich the fused features and thus obtain more boundary information. The model was trained and validated with transfer learning on the CVC-ClinicDB and Kvasir-SEG datasets, and its generalization was verified across the datasets. The experimental results show that Li-DeepLabV3+ has clear advantages in segmentation accuracy and speed, along with a certain generalization ability.
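A minimal sketch of attention-guided fusion of low-level and high-level features in the spirit of the Unified Attention Fusion Module; the statistics used and the kernel size are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UnifiedAttentionFusion(nn.Module):
    """Blend low-level (detail) and high-level (semantic) features with a
    learned spatial weight, a sketch of the UAFM idea described above."""
    def __init__(self):
        super().__init__()
        # 4 stats: mean/max over channels of each input -> 1 blending map
        self.attn = nn.Sequential(
            nn.Conv2d(4, 1, kernel_size=3, padding=1), nn.Sigmoid())

    def forward(self, low, high):
        high = F.interpolate(high, size=low.shape[2:], mode='bilinear',
                             align_corners=False)
        stats = torch.cat([low.mean(1, True), low.amax(1, True),
                           high.mean(1, True), high.amax(1, True)], dim=1)
        alpha = self.attn(stats)                 # (B, 1, H, W) blending weight
        return alpha * high + (1 - alpha) * low  # boundary-aware fusion

out = UnifiedAttentionFusion()(torch.randn(1, 64, 64, 64),
                               torch.randn(1, 64, 32, 32))
```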
Affiliation(s)
- Shiyu Xiang
- School of Electrical Engineering, Anhui Polytechnic University, Wuhu 241000, China
- Lisheng Wei
- Anhui Key Laboratory of Electric Drive and Control, Wuhu 241000, China
- Kaifeng Hu
- The First Affiliated Hospital of Wannan Medical College, Wuhu 241001, China
9. Selvaraj J, Umapathy S. CRPU-NET: a deep learning model based semantic segmentation for the detection of colorectal polyp in lower gastrointestinal tract. Biomed Phys Eng Express 2023;10:015018. [PMID: 38100789] [DOI: 10.1088/2057-1976/ad160f]
Abstract
Purpose: The objectives of the proposed work are twofold: first, to develop a specialized lightweight CRPU-Net for the segmentation of polyps in colonoscopy images; second, to conduct a comparative analysis of the performance of CRPU-Net against implemented state-of-the-art models. Methods: We utilized two distinct colonoscopy image datasets, CVC-ColonDB and CVC-ClinicDB. This paper introduces CRPU-Net, a novel approach for the automated segmentation of polyps in colorectal regions. A comprehensive series of experiments was conducted, and CRPU-Net's performance was compared with that of state-of-the-art models such as VGG16, VGG19, U-Net, and ResUnet++. Additional analyses, including an ablation study, a generalizability test, and 5-fold cross-validation, were performed. Results: CRPU-Net achieved a segmentation accuracy of 96.42%, compared with 90.91% for a state-of-the-art model such as ResUnet++. A Jaccard coefficient of 93.96% and a Dice coefficient of 95.77% were obtained by comparing the segmentation output of CRPU-Net with the ground truth. Conclusion: CRPU-Net exhibits outstanding performance in polyp segmentation and holds promise for integration into colonoscopy devices, enabling efficient operation.
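The reported Jaccard and Dice coefficients can be computed from binary masks as follows (a standard formulation, not the authors' evaluation script):

```python
import torch

def dice_jaccard(pred, target, eps=1e-7):
    """Dice and Jaccard (IoU) coefficients for binary segmentation masks,
    the metrics reported for CRPU-Net (95.77% Dice, 93.96% Jaccard)."""
    pred, target = pred.float().flatten(), target.float().flatten()
    inter = (pred * target).sum()
    union = pred.sum() + target.sum()
    dice = (2 * inter + eps) / (union + eps)
    jaccard = (inter + eps) / (union - inter + eps)
    return dice.item(), jaccard.item()

pred = torch.rand(1, 256, 256) > 0.5   # predicted mask
gt = torch.rand(1, 256, 256) > 0.5     # ground-truth mask
print(dice_jaccard(pred, gt))
```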
Affiliation(s)
- Jothiraj Selvaraj
- Department of Biomedical Engineering, College of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Chengalpattu-603203, Tamil Nadu, India
- Snekhalatha Umapathy
- Department of Biomedical Engineering, College of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Chengalpattu-603203, Tamil Nadu, India
10. Liu W, Li Z, Li C, Gao H. ECTransNet: An Automatic Polyp Segmentation Network Based on Multi-scale Edge Complementary. J Digit Imaging 2023;36:2427-2440. [PMID: 37491542] [PMCID: PMC10584793] [DOI: 10.1007/s10278-023-00885-y]
Abstract
Colonoscopy is acknowledged as the foremost technique for detecting polyps and facilitating early screening and prevention of colorectal cancer. In clinical settings, the segmentation of polyps from colonoscopy images holds paramount importance, as it furnishes critical diagnostic and surgical information. Nevertheless, the precise segmentation of colon polyp images is still a challenging task, owing to the varied sizes and morphological features of colon polyps and the indistinct boundary between polyps and mucosa. In this study, we present a novel network architecture named ECTransNet to address these challenges. Specifically, we propose an edge complementary module that effectively fuses the differences between features at multiple resolutions. This enables the network to exchange features across different levels and results in a substantial improvement in the edge fineness of the polyp segmentation. Additionally, we utilize a feature aggregation decoder that leverages residual blocks to adaptively fuse high-order to low-order features. This strategy restores local edges in low-order features while preserving the spatial information of targets in high-order features, ultimately enhancing segmentation accuracy. Extensive experiments demonstrate that ECTransNet outperforms most state-of-the-art approaches on five publicly available datasets. Specifically, our method achieved mDice scores of 0.901 and 0.923 on the Kvasir-SEG and CVC-ClinicDB datasets, respectively, and mDice scores of 0.907, 0.766, and 0.728 on the Endoscene, CVC-ColonDB, and ETIS datasets.
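A hedged sketch of the edge-complementary idea, fusing the difference between a fine feature and an upsampled coarse feature back into the fine branch; the refinement layers are assumptions, not the paper's exact module:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EdgeComplement(nn.Module):
    """The difference between an upsampled coarse feature and a fine feature
    highlights boundary detail lost at low resolution; a small conv block
    folds that residual back into the fine feature."""
    def __init__(self, channels):
        super().__init__()
        self.refine = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels), nn.ReLU())

    def forward(self, fine, coarse):
        coarse_up = F.interpolate(coarse, size=fine.shape[2:],
                                  mode='bilinear', align_corners=False)
        edge = fine - coarse_up          # residual detail, roughly the edges
        return fine + self.refine(edge)  # re-inject sharpened boundaries

out = EdgeComplement(64)(torch.randn(1, 64, 64, 64), torch.randn(1, 64, 32, 32))
```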
Affiliation(s)
- Weikang Liu
- School of Electronic and Information Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
- Zhigang Li
- School of Electronic and Information Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
- Chunyang Li
- School of Electronic and Information Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
- Hongyan Gao
- School of Electronic and Information Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
11. Lee GE, Cho J, Choi SI. Shallow and reverse attention network for colon polyp segmentation. Sci Rep 2023;13:15243. [PMID: 37709828] [PMCID: PMC10502036] [DOI: 10.1038/s41598-023-42436-z]
Abstract
Polyp segmentation is challenging because the boundary between polyps and mucosa is ambiguous. Several models have used attention mechanisms to address this problem, but they rely on finite information obtained from a single type of attention. We propose SRaNet, a new dual-attention network for colon polyp segmentation based on shallow and reverse attention modules. The shallow attention mechanism removes background noise while emphasizing locality by focusing on the foreground. In contrast, reverse attention helps distinguish the boundary between polyps and mucous membranes more clearly by focusing on the background. The two attention mechanisms are adaptively fused using a "softmax gate". Combining the two types of attention enables the model to capture complementary foreground and boundary features, so the proposed model predicts the boundaries of polyps more accurately than other models. We present the results of extensive experiments on polyp benchmarks to show that the proposed method outperforms existing models on both seen and unseen data. Furthermore, the results show that the proposed dual-attention module increases the explainability of the model.
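A minimal sketch of the shallow/reverse attention pair blended by a softmax gate, as the abstract describes; the gate parameterisation and the use of a coarse foreground map are assumptions:

```python
import torch
import torch.nn as nn

class DualAttentionGate(nn.Module):
    """Shallow attention (foreground mask) and reverse attention (1 - mask,
    emphasising background/boundary) applied to the same feature, then
    adaptively blended with a softmax gate."""
    def __init__(self):
        super().__init__()
        self.gate = nn.Parameter(torch.zeros(2))  # learned blending logits

    def forward(self, feat, coarse_map):          # coarse_map: (B,1,H,W) logits
        fg = torch.sigmoid(coarse_map)            # shallow/foreground attention
        shallow = feat * fg
        reverse = feat * (1 - fg)                 # reverse attention
        w = torch.softmax(self.gate, dim=0)       # the "softmax gate"
        return w[0] * shallow + w[1] * reverse

out = DualAttentionGate()(torch.randn(1, 64, 32, 32), torch.randn(1, 1, 32, 32))
```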
Affiliation(s)
- Go-Eun Lee
- Department of Computer Science and Engineering, Dankook University, Yongin, 16890, South Korea
- Jungchan Cho
- School of Computing, Gachon University, Seongnam, 13120, South Korea
- Sang-Il Choi
- Department of Computer Science and Engineering, Dankook University, Yongin, 16890, South Korea
12. Liu W, Li Z, Xia J, Li C. MCSF-Net: a multi-scale channel spatial fusion network for real-time polyp segmentation. Phys Med Biol 2023;68:175041. [PMID: 37582393] [DOI: 10.1088/1361-6560/acf090]
Abstract
Colorectal cancer is a globally prevalent cancer type that necessitates prompt screening. Colonoscopy is the established diagnostic technique for identifying colorectal polyps. However, missed polyp rates remain a concern. Early detection of polyps, while still precancerous, is vital for minimizing cancer-related mortality and economic impact. In the clinical setting, precise segmentation of polyps from colonoscopy images can provide valuable diagnostic and surgical information. Recent advances in computer-aided diagnostic systems, specifically those based on deep learning techniques, have shown promise in improving the detection rates of missed polyps, and thereby assisting gastroenterologists in improving polyp identification. In the present investigation, we introduce MCSF-Net, a real-time automatic segmentation framework that utilizes a multi-scale channel space fusion network. The proposed architecture leverages a multi-scale fusion module in conjunction with spatial and channel attention mechanisms to effectively amalgamate high-dimensional multi-scale features. Additionally, a feature complementation module is employed to extract boundary cues from low-dimensional features, facilitating enhanced representation of low-level features while keeping computational complexity to a minimum. Furthermore, we incorporate shape blocks to facilitate better model supervision for precise identification of boundary features of polyps. Our extensive evaluation of the proposed MCSF-Net on five publicly available benchmark datasets reveals that it outperforms several existing state-of-the-art approaches with respect to different evaluation metrics. The proposed approach runs at an impressive ∼45 FPS, demonstrating notable advantages in terms of scalability and real-time segmentation.
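A throughput figure like the ~45 FPS above is typically measured as below; the input size, warm-up, and run counts here are assumptions:

```python
import time
import torch

def measure_fps(model, size=(1, 3, 352, 352), runs=100, warmup=10):
    """Rough inference-throughput check of the kind behind an FPS figure."""
    model.eval()
    x = torch.randn(*size)
    with torch.no_grad():
        for _ in range(warmup):          # let caches and kernels settle
            model(x)
        if torch.cuda.is_available():
            torch.cuda.synchronize()
        t0 = time.perf_counter()
        for _ in range(runs):
            model(x)
        if torch.cuda.is_available():
            torch.cuda.synchronize()
    return runs / (time.perf_counter() - t0)

# e.g. measure_fps(MCSFNet())  # MCSFNet is the (hypothetical) model class
```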
Affiliation(s)
- Weikang Liu
- School of Electronic and Information Engineering, University of Science and Technology Liaoning, Anshan, 114051, People's Republic of China
- Zhigang Li
- School of Electronic and Information Engineering, University of Science and Technology Liaoning, Anshan, 114051, People's Republic of China
- Jiaao Xia
- School of Electronic and Information Engineering, University of Science and Technology Liaoning, Anshan, 114051, People's Republic of China
- Chunyang Li
- School of Electronic and Information Engineering, University of Science and Technology Liaoning, Anshan, 114051, People's Republic of China
13. Tomar NK, Shergill A, Rieders B, Bagci U, Jha D. TransResU-Net: A Transformer based ResU-Net for Real-Time Colon Polyp Segmentation. Annu Int Conf IEEE Eng Med Biol Soc 2023;2023:1-4. [PMID: 38083589] [DOI: 10.1109/embc40787.2023.10340572]
Abstract
Colorectal cancer (CRC) is one of the most common causes of cancer and cancer-related mortality worldwide. Performing colon cancer screening in a timely fashion is the key to early detection. Colonoscopy is the primary modality used to diagnose colon cancer. However, the miss rate of polyps, adenomas, and advanced adenomas remains significantly high. Early detection of polyps at the precancerous stage can help reduce the mortality rate and the economic burden associated with colorectal cancer. A deep learning-based computer-aided diagnosis (CADx) system may help gastroenterologists identify polyps that might otherwise be missed, thereby improving the polyp detection rate; it could also prove to be a cost-effective system that improves long-term colorectal cancer prevention. In this study, we propose a deep learning-based architecture for automatic polyp segmentation called Transformer ResU-Net (TransResU-Net). The proposed architecture is built upon residual blocks with ResNet-50 as the backbone and takes advantage of the transformer self-attention mechanism as well as dilated convolutions. Our experimental results on two publicly available polyp segmentation benchmark datasets showed that TransResU-Net obtained a highly promising Dice score at real-time speed. Given these performance metrics, we conclude that TransResU-Net could be a strong benchmark for building a real-time polyp detection system for the early diagnosis, treatment, and prevention of colorectal cancer. The source code of the proposed TransResU-Net is publicly available at https://github.com/nikhilroxtomar/TransResUNet.
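A minimal sketch combining the named ingredients, a ResNet-50 encoder, self-attention on the deepest features, and a dilated convolution; the real TransResU-Net decoder and skip connections are omitted:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class TransResSketch(nn.Module):
    """Encoder-side sketch only: ResNet-50 features, global self-attention
    over the deepest feature map, then a dilated conv for a wider
    receptive field. Not the authors' full architecture."""
    def __init__(self):
        super().__init__()
        backbone = resnet50(weights=None)
        self.encoder = nn.Sequential(*list(backbone.children())[:-2])  # drop avgpool/fc
        self.attn = nn.MultiheadAttention(2048, num_heads=8, batch_first=True)
        self.dilated = nn.Conv2d(2048, 256, 3, padding=2, dilation=2)

    def forward(self, x):
        f = self.encoder(x)                      # (B, 2048, h, w)
        B, C, H, W = f.shape
        seq = f.flatten(2).transpose(1, 2)       # (B, h*w, C) tokens
        seq, _ = self.attn(seq, seq, seq)        # global self-attention
        f = seq.transpose(1, 2).reshape(B, C, H, W)
        return self.dilated(f)                   # enlarged receptive field

feat = TransResSketch()(torch.randn(1, 3, 256, 256))  # -> (1, 256, 8, 8)
```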
14. Zhu J, Ge M, Chang Z, Dong W. CRCNet: Global-local context and multi-modality cross attention for polyp segmentation. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104593]
15. Ali S. Where do we stand in AI for endoscopic image analysis? Deciphering gaps and future directions. NPJ Digit Med 2022;5:184. [PMID: 36539473] [PMCID: PMC9767933] [DOI: 10.1038/s41746-022-00733-3]
Abstract
Recent developments in deep learning have enabled data-driven algorithms that can reach human-level performance and beyond. The development and deployment of medical image analysis methods face several challenges, including data heterogeneity due to population diversity and different device manufacturers; in addition, more input from experts is required for a reliable method development process. While the exponential growth in clinical imaging data has enabled deep learning to flourish, data heterogeneity, multi-modality, and rare or inconspicuous disease cases still need to be explored. Because endoscopy is highly operator-dependent, with grim clinical outcomes in some disease cases, reliable and accurate automated system guidance can improve patient care. Most existing methods need to be more generalisable to unseen target data, patient population variability, and variable disease appearances. This paper reviews recent work on endoscopic image analysis with artificial intelligence (AI) and emphasises the currently unmet needs in this field. Finally, it outlines future directions for clinically relevant, complex AI solutions to improve patient outcomes.
Affiliation(s)
- Sharib Ali
- School of Computing, University of Leeds, LS2 9JT, Leeds, UK