51. Zhang W, Lu F, Su H, Hu Y. Dual-branch multi-information aggregation network with transformer and convolution for polyp segmentation. Comput Biol Med 2024;168:107760. PMID: 38064849. DOI: 10.1016/j.compbiomed.2023.107760.
Abstract
Computer-Aided Diagnosis (CAD) for polyp detection offers one of the most notable showcases of deep learning in medicine: with deep learning technologies, the accuracy of polyp segmentation is surpassing that of human experts. In such a CAD process, a critical step is segmenting colorectal polyps from colonoscopy images. Despite the remarkable successes of recent deep learning-based works, much improvement is still anticipated for challenging cases. For instance, motion blur and light reflection can introduce significant noise into the image, and polyps of the same type vary widely in size, color, and texture. To address these challenges, this paper proposes a novel dual-branch multi-information aggregation network (DBMIA-Net) for polyp segmentation, which accurately, reliably, and efficiently segments a variety of colorectal polyps. Specifically, a dual-branch encoder with transformer and convolutional neural network (CNN) branches is employed to extract polyp features, and two multi-information aggregation modules are applied in the decoder to fuse multi-scale features adaptively: a global information aggregation (GIA) module and an edge information aggregation (EIA) module. In addition, to enhance the representation learning of latent channel feature associations, this paper also proposes a novel adaptive channel graph convolution (ACGC). To validate the effectiveness and advantages of the proposed network, we compare it with several state-of-the-art (SOTA) methods on five public datasets. Experimental results consistently demonstrate that the proposed DBMIA-Net obtains significantly superior segmentation performance across six popular evaluation metrics. In particular, we achieve 94.12% mean Dice on the CVC-ClinicDB dataset, a 4.22% improvement over the previous state-of-the-art method PraNet. Compared with SOTA algorithms, DBMIA-Net has better fitting ability and stronger generalization ability.
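The entry above reports performance in mean Dice. For readers unfamiliar with the metric, the following is a minimal, illustrative sketch (not the authors' code) of how the Dice similarity coefficient is typically computed for a pair of binary segmentation masks; the function name and NumPy-based interface are our own.

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice similarity coefficient between two binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Example: a prediction identical to the ground truth yields a Dice score of 1.0.
mask = np.zeros((256, 256), dtype=np.uint8)
mask[100:150, 100:150] = 1
print(dice_coefficient(mask, mask))  # 1.0
```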
Affiliation(s)
- Wenyu Zhang: School of Information Science and Engineering, Lanzhou University, China
- Fuxiang Lu: School of Information Science and Engineering, Lanzhou University, China
- Hongjing Su: School of Information Science and Engineering, Lanzhou University, China
- Yawen Hu: School of Information Science and Engineering, Lanzhou University, China
52. Jain S, Atale R, Gupta A, Mishra U, Seal A, Ojha A, Jaworek-Korjakowska J, Krejcar O. CoInNet: A Convolution-Involution Network With a Novel Statistical Attention for Automatic Polyp Segmentation. IEEE Trans Med Imaging 2023;42:3987-4000. PMID: 37768798. DOI: 10.1109/tmi.2023.3320151.
Abstract
Polyps are very common abnormalities in human gastrointestinal regions. Their early diagnosis may help in reducing the risk of colorectal cancer. Vision-based computer-aided diagnostic systems automatically identify polyp regions to assist surgeons in their removal. Due to their varying shape, color, size, texture, and unclear boundaries, polyp segmentation in images is a challenging problem. Existing deep learning segmentation models mostly rely on convolutional neural networks that have certain limitations in learning the diversity in visual patterns at different spatial locations. Further, they fail to capture inter-feature dependencies. Vision transformer models have also been deployed for polyp segmentation due to their powerful global feature extraction capabilities. But they too are supplemented by convolution layers for learning contextual local information. In the present paper, a polyp segmentation model CoInNet is proposed with a novel feature extraction mechanism that leverages the strengths of convolution and involution operations and learns to highlight polyp regions in images by considering the relationship between different feature maps through a statistical feature attention unit. To further aid the network in learning polyp boundaries, an anomaly boundary approximation module is introduced that uses recursively fed feature fusion to refine segmentation results. It is indeed remarkable that even tiny-sized polyps with only 0.01% of an image area can be precisely segmented by CoInNet. It is crucial for clinical applications, as small polyps can be easily overlooked even in the manual examination due to the voluminous size of wireless capsule endoscopy videos. CoInNet outperforms thirteen state-of-the-art methods on five benchmark polyp segmentation datasets.
53. Xu S, Duan L, Zhang Y, Zhang Z, Sun T, Tian L. Graph- and transformer-guided boundary aware network for medical image segmentation. Comput Methods Programs Biomed 2023;242:107849. PMID: 37837887. DOI: 10.1016/j.cmpb.2023.107849.
Abstract
BACKGROUND AND OBJECTIVE: Despite the considerable progress achieved by U-Net-based models, medical image segmentation remains a challenging task due to complex backgrounds, irrelevant noises, and ambiguous boundaries. In this study, we present a novel approach called U-shaped Graph- and Transformer-guided Boundary Aware Network (GTBA-Net) to tackle these challenges. METHODS: GTBA-Net uses the pre-trained ResNet34 as its basic structure, and involves Global Feature Aggregation (GFA) modules for target localization, Graph-based Dynamic Feature Fusion (GDFF) modules for effective noise suppression, and Uncertainty-based Boundary Refinement (UBR) modules for accurate delineation of ambiguous boundaries. The GFA modules employ an efficient self-attention mechanism to facilitate coarse target localization amidst complex backgrounds, without introducing additional computational complexity. The GDFF modules leverage a graph attention mechanism to aggregate information hidden among high- and low-level features, effectively suppressing target-irrelevant noises while preserving valuable spatial details. The UBR modules introduce an uncertainty quantification strategy and an auxiliary loss to guide the model's focus towards target regions and uncertain "ridges", gradually mitigating boundary uncertainty and ultimately achieving accurate boundary delineation. RESULTS: Comparative experiments on five datasets encompassing diverse modalities (including X-ray, CT, endoscopic procedures, and ultrasound) demonstrate that the proposed GTBA-Net outperforms existing methods in various challenging scenarios. Subsequent ablation studies further demonstrate the efficacy of the GFA, GDFF, and UBR modules in target localization, noise suppression, and ambiguous boundary delineation, respectively. CONCLUSIONS: GTBA-Net exhibits substantial potential for extensive application in the field of medical image segmentation, particularly in scenarios involving complex backgrounds, target-irrelevant noises, or ambiguous boundaries.
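The UBR module described above relies on an uncertainty quantification strategy whose exact formulation is not given in the abstract. The sketch below assumes a common proxy in which a pixel's uncertainty grows as its sigmoid probability approaches 0.5, which concentrates weight on ambiguous boundary "ridges"; function and variable names are hypothetical, not the paper's.

```python
import torch

def boundary_uncertainty(prob: torch.Tensor) -> torch.Tensor:
    """Pixel-wise uncertainty of a sigmoid probability map.

    Values near 0.5 (typically along ambiguous boundaries) get uncertainty
    close to 1; confident foreground/background pixels get close to 0.
    """
    return 1.0 - torch.abs(2.0 * prob - 1.0)

prob = torch.rand(1, 1, 64, 64)         # stand-in for a predicted probability map
uncertainty = boundary_uncertainty(prob)
weights = 1.0 + 5.0 * uncertainty        # e.g., up-weight uncertain pixels in an auxiliary loss
print(uncertainty.shape, weights.mean().item())
```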
Affiliation(s)
- Shanshan Xu: School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China; Beijing Key Laboratory of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China
- Lianhong Duan: The Second School of Clinical Medicine, Southern Medical University, Guangzhou, China; Senior Department of Orthopedics, The Fourth Medical Center of PLA General Hospital, Beijing, China
- Yang Zhang: Senior Department of Orthopedics, The Fourth Medical Center of PLA General Hospital, Beijing, China
- Zhicheng Zhang: Senior Department of Orthopedics, The Fourth Medical Center of PLA General Hospital, Beijing, China
- Tiansheng Sun: The Second School of Clinical Medicine, Southern Medical University, Guangzhou, China; Senior Department of Orthopedics, The Fourth Medical Center of PLA General Hospital, Beijing, China
- Lixia Tian: School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
54. Samarasena J, Yang D, Berzin TM. AGA Clinical Practice Update on the Role of Artificial Intelligence in Colon Polyp Diagnosis and Management: Commentary. Gastroenterology 2023;165:1568-1573. PMID: 37855759. DOI: 10.1053/j.gastro.2023.07.010.
Abstract
DESCRIPTION: The purpose of this American Gastroenterological Association (AGA) Institute Clinical Practice Update (CPU) is to review the available evidence and provide expert commentary on the current landscape of artificial intelligence in the evaluation and management of colorectal polyps. METHODS: This CPU was commissioned and approved by the AGA Institute Clinical Practice Updates Committee (CPUC) and the AGA Governing Board to provide timely guidance on a topic of high clinical importance to the AGA membership and underwent internal peer review by the CPUC and external peer review through standard procedures of Gastroenterology. This Expert Commentary incorporates important as well as recently published studies in this field, and it reflects the experiences of the authors, who are experienced endoscopists with expertise in the field of artificial intelligence and colorectal polyps.
Affiliation(s)
- Jason Samarasena: Division of Gastroenterology, University of California Irvine, Orange, California
- Dennis Yang: Center for Interventional Endoscopy, AdventHealth, Orlando, Florida
- Tyler M Berzin: Center for Advanced Endoscopy, Beth Israel Deaconess Medical Center and Harvard Medical School, Boston, Massachusetts
55. Chen W, Zhang R, Zhang Y, Bao F, Lv H, Li L, Zhang C. Pact-Net: Parallel CNNs and Transformers for medical image segmentation. Comput Methods Programs Biomed 2023;242:107782. PMID: 37690317. DOI: 10.1016/j.cmpb.2023.107782.
Abstract
BACKGROUND AND OBJECTIVE: Image segmentation of diseased regions can support clinical diagnosis and treatment in medical image analysis. Because medical images usually have low contrast and large variations in the size and shape of some structures, over-segmentation and under-segmentation often occur. These problems are particularly evident in the segmentation of skin damage: blurred boundaries in skin images and patient-specific appearance further increase the difficulty of skin lesion segmentation. Currently, most researchers use deep learning networks to solve these segmentation problems. However, traditional convolution methods often fail to obtain satisfactory segmentation performance due to their shortcomings in obtaining global features. Recently, Transformers, with their good global information extraction ability, have achieved satisfactory results in computer vision, which brings new solutions for further optimizing medical image segmentation models. METHODS: To extract more features relevant to medical image segmentation and use them effectively to further optimize the skin image segmentation model, we designed a network that combines CNNs and Transformers to improve local and global features, called Parallel CNNs and Transformers for Medical Image Segmentation (Pact-Net). Specifically, exploiting the advantages of Transformers in extracting global information, we create a novel fusion module, CSMF, which uses channel and spatial attention mechanisms and a multi-scale mechanism to effectively fuse the global information extracted by Transformers into the local features extracted by CNNs. The two branches of Pact-Net therefore run in parallel to effectively capture global and local information. RESULTS: Pact-Net exceeds the models submitted on the three datasets ISIC 2016, ISIC 2017 and ISIC 2018, with the required indicators reaching 86.95%, 79.31% and 84.14%, respectively. We also conducted medical image segmentation experiments on cell and polyp datasets to evaluate the robustness, learning and generalization ability of the network. The ablation study of each part of Pact-Net proves the validity of each component, and the comparison with state-of-the-art methods on different indicators proves the predominance of the network. CONCLUSIONS: This paper uses the advantages of CNNs and Transformers in extracting local and global features, and further integrates these features for skin lesion segmentation. Compared with state-of-the-art methods, Pact-Net achieves the most advanced segmentation ability on the skin lesion segmentation datasets, which can help doctors diagnose and treat diseases.
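The CSMF module described above fuses Transformer-derived global features into CNN-derived local features via channel attention, spatial attention, and multi-scale cues. The block below is a simplified, assumption-laden sketch of that general idea (a squeeze-and-excitation-style channel gate plus a convolutional spatial gate); it is not the published Pact-Net implementation, and the multi-scale part is omitted.

```python
import torch
import torch.nn as nn

class ChannelSpatialFusion(nn.Module):
    """Toy fusion block: re-weight global features by channel and spatial
    attention, then add them to the local CNN features."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, local_feat: torch.Tensor, global_feat: torch.Tensor) -> torch.Tensor:
        g = global_feat * self.channel_gate(global_feat)  # channel attention
        g = g * self.spatial_gate(g)                      # spatial attention
        return local_feat + g                             # inject global context into local features

fusion = ChannelSpatialFusion(channels=64)
out = fusion(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
print(out.shape)  # torch.Size([1, 64, 32, 32])
```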
Affiliation(s)
- Weilin Chen: School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, Shandong, 250014, China
- Rui Zhang: School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, Shandong, 250014, China
- Yunfeng Zhang: School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, Shandong, 250014, China
- Fangxun Bao: School of Mathematics, Shandong University, Jinan, Shandong, 250100, China
- Haixia Lv: School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, Shandong, 250014, China
- Longhao Li: School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, Shandong, 250014, China
- Caiming Zhang: School of Software, Shandong University, Jinan, Shandong, 250101, China; Shandong Co-Innovation Center of Future Intelligent Computing, Yantai, Shandong, 264025, China
56. Zhu S, Gao J, Liu L, Yin M, Lin J, Xu C, Xu C, Zhu J. Public Imaging Datasets of Gastrointestinal Endoscopy for Artificial Intelligence: a Review. J Digit Imaging 2023;36:2578-2601. PMID: 37735308. PMCID: PMC10584770. DOI: 10.1007/s10278-023-00844-7.
Abstract
With the advances in endoscopic technologies and artificial intelligence, a large number of endoscopic imaging datasets have been made public to researchers around the world. This study aims to review and introduce these datasets. An extensive literature search was conducted to identify appropriate datasets in PubMed, and other targeted searches were conducted in GitHub, Kaggle, and Simula to identify datasets directly. We provided a brief introduction to each dataset and evaluated the characteristics of the datasets included. Moreover, two national datasets in progress were discussed. A total of 40 datasets of endoscopic images were included, of which 34 were accessible for use. Basic and detailed information on each dataset was reported. Of all the datasets, 16 focus on polyps, and 6 focus on small bowel lesions. Most datasets (n = 16) were constructed by colonoscopy only, followed by normal gastrointestinal endoscopy and capsule endoscopy (n = 9). This review may facilitate the usage of public dataset resources in endoscopic research.
Affiliation(s)
- Shiqi Zhu: Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou, Jiangsu, 215000, China; Suzhou Clinical Center of Digestive Diseases, Suzhou, 215000, China
- Jingwen Gao: Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou, Jiangsu, 215000, China; Suzhou Clinical Center of Digestive Diseases, Suzhou, 215000, China
- Lu Liu: Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou, Jiangsu, 215000, China; Suzhou Clinical Center of Digestive Diseases, Suzhou, 215000, China
- Minyue Yin: Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou, Jiangsu, 215000, China; Suzhou Clinical Center of Digestive Diseases, Suzhou, 215000, China
- Jiaxi Lin: Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou, Jiangsu, 215000, China; Suzhou Clinical Center of Digestive Diseases, Suzhou, 215000, China
- Chang Xu: Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou, Jiangsu, 215000, China; Suzhou Clinical Center of Digestive Diseases, Suzhou, 215000, China
- Chunfang Xu: Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou, Jiangsu, 215000, China; Suzhou Clinical Center of Digestive Diseases, Suzhou, 215000, China
- Jinzhou Zhu: Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou, Jiangsu, 215000, China; Suzhou Clinical Center of Digestive Diseases, Suzhou, 215000, China
57. Mu N, Guo J, Wang R. Automated polyp segmentation based on a multi-distance feature dissimilarity-guided fully convolutional network. Math Biosci Eng 2023;20:20116-20134. PMID: 38052639. DOI: 10.3934/mbe.2023891.
Abstract
Colorectal malignancies often arise from adenomatous polyps, which typically begin as solitary, asymptomatic growths before progressing to malignancy. Colonoscopy is widely recognized as a highly efficacious clinical polyp detection method, offering valuable visual data that facilitates precise identification and subsequent removal of these tumors. Nevertheless, accurately segmenting individual polyps poses a considerable difficulty because polyps exhibit intricate and changeable characteristics, including shape, size, color, quantity and growth context, during different stages. The presence of similar contextual structures around polyps significantly hampers the ability of commonly used convolutional neural network (CNN)-based automatic detection models to accurately capture valid polyp features, and these large-receptive-field CNN models often overlook the details of small polyps, which leads to false and missed detections. To tackle these challenges, we introduce a novel approach for automatic polyp segmentation, known as the multi-distance feature dissimilarity-guided fully convolutional network. This approach comprises three essential components, i.e., an encoder-decoder, a multi-distance difference (MDD) module and a hybrid loss (HL) module. Specifically, the MDD module primarily employs a multi-layer feature subtraction (MLFS) strategy to propagate features from the encoder to the decoder, which focuses on extracting information differences between neighboring layers' features at short distances, as well as both short- and long-distance feature differences across layers. Drawing inspiration from pyramids, the MDD module effectively acquires discriminative features from neighboring layers or across layers in a continuous manner, which helps to strengthen feature complementarity across different layers. The HL module is responsible for supervising the feature maps extracted at each layer of the network to improve prediction accuracy. Our experimental results on four challenging datasets demonstrate that the proposed approach exhibits superior automatic polyp segmentation performance in terms of six evaluation criteria compared to five current state-of-the-art approaches.
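The MDD module above builds on subtracting features of neighboring and distant encoder layers. As a rough illustration only (the paper's exact MLFS strategy is not reproduced here), the PyTorch-style sketch below resizes a list of feature maps to a common resolution and forms short- and long-distance absolute differences; it assumes all maps share the same channel count, and all names are hypothetical.

```python
import torch
import torch.nn.functional as F

def multi_distance_differences(features):
    """Toy version of multi-distance feature differences.

    `features` is a list of encoder feature maps ordered shallow to deep.
    Short-distance differences compare neighbouring layers; one
    long-distance difference compares the shallowest and deepest layers.
    All maps are resized to the shallowest resolution before subtraction.
    """
    base_size = features[0].shape[-2:]
    resized = [F.interpolate(f, size=base_size, mode="bilinear", align_corners=False)
               for f in features]
    short = [torch.abs(resized[i] - resized[i + 1]) for i in range(len(resized) - 1)]
    long = torch.abs(resized[0] - resized[-1])
    return short, long

# Dummy three-level pyramid with equal channel counts.
feats = [torch.randn(1, 32, s, s) for s in (64, 32, 16)]
short_diffs, long_diff = multi_distance_differences(feats)
print(len(short_diffs), long_diff.shape)
```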
Affiliation(s)
- Nan Mu: College of Computer Science, Sichuan Normal University, Chengdu 610101, China; Visual Computing and Virtual Reality Key Laboratory of Sichuan, Sichuan Normal University, Chengdu 610068, China; Education Big Data Collaborative Innovation Center of Sichuan 2011, Chengdu 610101, China
- Jinjia Guo: Chongqing University-University of Cincinnati Joint Co-op Institution, Chongqing University, Chongqing 400044, China
- Rong Wang: College of Computer Science, Sichuan Normal University, Chengdu 610101, China; Visual Computing and Virtual Reality Key Laboratory of Sichuan, Sichuan Normal University, Chengdu 610068, China; Education Big Data Collaborative Innovation Center of Sichuan 2011, Chengdu 610101, China
58. Chen F, Ma H, Zhang W. SegT: Separated edge-guidance transformer network for polyp segmentation. Math Biosci Eng 2023;20:17803-17821. PMID: 38052537. DOI: 10.3934/mbe.2023791.
Abstract
Accurate segmentation of colonoscopic polyps is considered a fundamental step in medical image analysis and surgical interventions. Many recent studies have made improvements based on the encoder-decoder framework, which can effectively segment diverse polyps. Such improvements mainly aim to enhance local features by using global features and applying attention methods. However, relying only on the global information of the final encoder block can result in losing local regional features in the intermediate layer. In addition, determining the edges between benign regions and polyps could be a challenging task. To address the aforementioned issues, we propose a novel separated edge-guidance transformer (SegT) network that aims to build an effective polyp segmentation model. A transformer encoder that learns a more robust representation than existing convolutional neural network-based approaches was specifically applied. To determine the precise segmentation of polyps, we utilize a separated edge-guidance module consisting of separator and edge-guidance blocks. The separator block is a two-stream operator to highlight edges between the background and foreground, whereas the edge-guidance block lies behind both streams to strengthen the understanding of the edge. Lastly, an innovative cascade fusion module was used and fused the refined multi-level features. To evaluate the effectiveness of SegT, we conducted experiments with five challenging public datasets, and the proposed model achieved state-of-the-art performance.
Affiliation(s)
- Feiyu Chen: Department of Mathematics, Physics and Information Sciences, Shaoxing University, Shaoxing, China
- Haiping Ma: Department of Mathematics, Physics and Information Sciences, Shaoxing University, Shaoxing, China
- Weijia Zhang: Department of Mathematics, Physics and Information Sciences, Shaoxing University, Shaoxing, China; Department of AOP Physics, Visiting Scholar, University of Oxford, Oxford, United Kingdom
59. Lee GE, Cho J, Choi SI. Shallow and reverse attention network for colon polyp segmentation. Sci Rep 2023;13:15243. PMID: 37709828. PMCID: PMC10502036. DOI: 10.1038/s41598-023-42436-z.
Abstract
Polyp segmentation is challenging because the boundary between polyps and mucosa is ambiguous. Several models have considered the use of attention mechanisms to solve this problem. However, these models use only the limited information obtained from a single type of attention. We propose a new dual-attention network based on shallow and reverse attention modules for colon polyp segmentation, called SRaNet. The shallow attention mechanism removes background noise while emphasizing locality by focusing on the foreground. In contrast, reverse attention helps distinguish the boundary between polyps and mucous membranes more clearly by focusing on the background. The two attention mechanisms are adaptively fused using a "Softmax Gate". Combining the two types of attention enables the model to capture complementary foreground and boundary features; therefore, the proposed model predicts the boundaries of polyps more accurately than other models. We present the results of extensive experiments on polyp benchmarks to show that the proposed method outperforms existing models on both seen and unseen data. Furthermore, the results show that the proposed dual attention module increases the explainability of the model.
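Reverse attention, as used above, weights decoder features by the complement of a coarse prediction so that the network attends to background and boundary regions, while shallow (foreground) attention does the opposite. The sketch below illustrates both ideas with standard operations; the fixed 0.5/0.5 mix stands in for the paper's adaptive "Softmax Gate" and is purely illustrative, not the SRaNet implementation.

```python
import torch
import torch.nn.functional as F

def reverse_attention(feature: torch.Tensor, coarse_logits: torch.Tensor) -> torch.Tensor:
    """Weight a feature map by the complement of a coarse prediction so the
    network attends to background/boundary regions (the reverse-attention idea)."""
    attn = 1.0 - torch.sigmoid(
        F.interpolate(coarse_logits, size=feature.shape[-2:], mode="bilinear", align_corners=False)
    )
    return feature * attn

def shallow_attention(feature: torch.Tensor, coarse_logits: torch.Tensor) -> torch.Tensor:
    """Foreground-focused counterpart: weight features by the prediction itself."""
    attn = torch.sigmoid(
        F.interpolate(coarse_logits, size=feature.shape[-2:], mode="bilinear", align_corners=False)
    )
    return feature * attn

feat = torch.randn(1, 64, 44, 44)
coarse = torch.randn(1, 1, 11, 11)   # coarse polyp map from a deeper stage
fused = 0.5 * reverse_attention(feat, coarse) + 0.5 * shallow_attention(feat, coarse)
print(fused.shape)
```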
Affiliation(s)
- Go-Eun Lee: Department of Computer Science and Engineering, Dankook University, Yongin, 16890, South Korea
- Jungchan Cho: School of Computing, Gachon University, Seongnam, 13120, South Korea
- Sang-Il Choi: Department of Computer Science and Engineering, Dankook University, Yongin, 16890, South Korea
60. Shukla S, Birla L, Gupta AK, Gupta P. Trustworthy Medical Image Segmentation with improved performance for in-distribution samples. Neural Netw 2023;166:127-136. PMID: 37487410. DOI: 10.1016/j.neunet.2023.06.047.
Abstract
Despite the enormous achievements of Deep Learning (DL) based models, their non-transparent nature has led to restricted applicability and distrusted predictions. Such predictions arise from erroneous In-Distribution (ID) and Out-Of-Distribution (OOD) samples and can have disastrous effects in the medical domain, specifically in Medical Image Segmentation (MIS). To mitigate such effects, several existing works perform OOD sample detection; however, trustworthiness issues arising from ID samples still require thorough investigation. To this end, a novel method, TrustMIS (Trustworthy Medical Image Segmentation), is proposed in this paper, which provides trustworthiness estimates and improved performance on ID samples for DL-based MIS models. TrustMIS consists of three parts: IT (Investigating Trustworthiness), INT (Improving Non-Trustworthy prediction) and CSO (Classifier Switching Operation). Initially, the IT method investigates the trustworthiness of MIS by leveraging similar characteristics and consistency analysis of an input and its variants. Subsequently, the INT method employs the IT method to improve the performance of the MIS model, leveraging the observation that an input yielding an erroneous segmentation can yield a correct segmentation when rotated. Eventually, the CSO method employs the INT method to scrutinise several MIS models and selects the model that delivers the most trustworthy prediction. Experiments conducted on publicly available datasets using well-known MIS models reveal that TrustMIS successfully provides a trustworthiness measure, outperforms existing methods, and improves the performance of state-of-the-art MIS models. Our implementation is available at https://github.com/SnehaShukla937/TrustMIS.
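The idea above exploits the observation that a trustworthy prediction should be stable under simple input transformations such as rotation. The following is a toy consistency probe, not the TrustMIS algorithm: it compares the segmentation of an image with the back-rotated segmentation of its rotated copy via Dice overlap. The model interface (logits of shape B x 1 x H x W) and all names are assumptions.

```python
import torch

@torch.no_grad()
def rotation_consistency(model, image: torch.Tensor, threshold: float = 0.5) -> float:
    """Toy trustworthiness probe: Dice overlap between the segmentation of an
    image and the (back-rotated) segmentation of its 90-degree-rotated variant.
    A low score flags the prediction as potentially non-trustworthy."""
    pred = torch.sigmoid(model(image)) > threshold
    rotated = torch.rot90(image, k=1, dims=(-2, -1))
    pred_rot = torch.sigmoid(model(rotated)) > threshold
    pred_rot = torch.rot90(pred_rot, k=-1, dims=(-2, -1))  # undo the rotation
    inter = (pred & pred_rot).sum().item()
    total = pred.sum().item() + pred_rot.sum().item()
    return 2.0 * inter / total if total > 0 else 1.0

# Usage with any segmentation model returning logits of shape (B, 1, H, W):
model = torch.nn.Conv2d(3, 1, kernel_size=3, padding=1)   # stand-in "model"
score = rotation_consistency(model, torch.randn(1, 3, 128, 128))
print(f"consistency score: {score:.3f}")
```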
Affiliation(s)
- Sneha Shukla: Indian Institute of Technology Indore, Indore, India
- Puneet Gupta: Indian Institute of Technology Indore, Indore, India
61. Yang L, Zhai C, Liu Y, Yu H. CFHA-Net: A polyp segmentation method with cross-scale fusion strategy and hybrid attention. Comput Biol Med 2023;164:107301. PMID: 37573723. DOI: 10.1016/j.compbiomed.2023.107301.
Abstract
Colorectal cancer is a prevalent disease in modern times, with most cases being caused by polyps. Therefore, the segmentation of polyps has garnered significant attention in the field of medical image segmentation. In recent years, variant networks derived from U-Net have demonstrated good performance on polyp segmentation challenges. In this paper, a polyp segmentation model called CFHA-Net is proposed that combines a cross-scale feature fusion strategy and a hybrid attention mechanism. Inspired by feature learning, the encoder unit incorporates a cross-scale context fusion (CCF) module that performs cross-layer feature fusion and enhances the feature information of different scales. The skip connection is optimized by the proposed triple hybrid attention (THA) module, which aggregates spatial and channel attention features from three directions to improve the long-range dependence between features and help identify subsequent polyp lesion boundaries. Additionally, a dense-receptive feature fusion (DFF) module, which combines dense connections and multi-receptive-field fusion, is added at the bottleneck layer to capture more comprehensive context information. Furthermore, a hybrid pooling (HP) module and a hybrid upsampling (HU) module are proposed to help the segmentation network acquire more contextual features. A series of experiments has been conducted on three typical polyp segmentation datasets (CVC-ClinicDB, Kvasir-SEG, EndoTect) to evaluate the effectiveness and generalization of the proposed CFHA-Net. The experimental results demonstrate the validity and generalization of the proposed method, with many performance metrics surpassing those of related advanced segmentation networks. Therefore, the proposed CFHA-Net could present a promising solution to the challenges of polyp segmentation in medical image analysis. The source code of the proposed CFHA-Net is available at https://github.com/CXzhai/CFHA-Net.git.
Affiliation(s)
- Lei Yang: School of Electrical and Information Engineering, Zhengzhou University, Henan Province, 450001, China; Robot Perception and Control Engineering Laboratory of Henan Province, 450001, China
- Chenxu Zhai: School of Electrical and Information Engineering, Zhengzhou University, Henan Province, 450001, China; Robot Perception and Control Engineering Laboratory of Henan Province, 450001, China
- Yanhong Liu: School of Electrical and Information Engineering, Zhengzhou University, Henan Province, 450001, China; Robot Perception and Control Engineering Laboratory of Henan Province, 450001, China
- Hongnian Yu: School of Electrical and Information Engineering, Zhengzhou University, Henan Province, 450001, China; Built Environment, Edinburgh Napier University, Edinburgh EH10 5DT, UK
62. Liu Z, Lv Q, Yang Z, Li Y, Lee CH, Shen L. Recent progress in transformer-based medical image analysis. Comput Biol Med 2023;164:107268. PMID: 37494821. DOI: 10.1016/j.compbiomed.2023.107268.
Abstract
The transformer is primarily used in the field of natural language processing. Recently, it has been adopted and shows promise in the computer vision (CV) field. Medical image analysis (MIA), as a critical branch of CV, also greatly benefits from this state-of-the-art technique. In this review, we first recap the core component of the transformer, the attention mechanism, and the detailed structures of the transformer. After that, we depict the recent progress of the transformer in the field of MIA. We organize the applications in a sequence of different tasks, including classification, segmentation, captioning, registration, detection, enhancement, localization, and synthesis. The mainstream classification and segmentation tasks are further divided into eleven medical image modalities. A large number of experiments studied in this review illustrate that the transformer-based method outperforms existing methods through comparisons with multiple evaluation metrics. Finally, we discuss the open challenges and future opportunities in this field. This task-modality review with the latest contents, detailed information, and comprehensive comparison may greatly benefit the broad MIA community.
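Since the review recaps the attention mechanism at the heart of the transformer, the standard scaled dot-product self-attention it refers to can be written in a few lines. The snippet below is a generic illustration, not code from the review; the token shapes are arbitrary examples.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v):
    """Standard attention: softmax(Q K^T / sqrt(d)) V, the core operation
    recapped by the review before it surveys transformer-based MIA methods."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    weights = torch.softmax(scores, dim=-1)
    return weights @ v

tokens = torch.randn(1, 196, 64)  # e.g., 14x14 image patches embedded into 64-dim tokens
out = scaled_dot_product_attention(tokens, tokens, tokens)  # self-attention
print(out.shape)  # torch.Size([1, 196, 64])
```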
Affiliation(s)
- Zhaoshan Liu: Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore
- Qiujie Lv: Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore; School of Intelligent Systems Engineering, Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, 518107, China
- Ziduo Yang: Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore; School of Intelligent Systems Engineering, Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, 518107, China
- Yifan Li: Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore
- Chau Hung Lee: Department of Radiology, Tan Tock Seng Hospital, 11 Jalan Tan Tock Seng, Singapore, 308433, Singapore
- Lei Shen: Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore
63. Ghimire R, Lee SW. MMNet: A Mixing Module Network for Polyp Segmentation. Sensors (Basel) 2023;23:7258. PMID: 37631792. PMCID: PMC10458640. DOI: 10.3390/s23167258.
Abstract
Traditional encoder-decoder networks like U-Net have been extensively used for polyp segmentation. However, such networks have demonstrated limitations in explicitly modeling long-range dependencies. In such networks, local patterns are emphasized over the global context, as each convolutional kernel focuses on only a local subset of pixels in the entire image. Several recent transformer-based networks have been shown to overcome such limitations. Such networks encode long-range dependencies using self-attention methods and thus learn highly expressive representations. However, due to the computational complexity of modeling the whole image, self-attention is expensive to compute, as there is a quadratic increment in cost with the increase in pixels in the image. Thus, patch embedding has been utilized, which groups small regions of the image into single input features. Nevertheless, these transformers still lack inductive bias, even with the image as a 1D sequence of visual tokens. This results in the inability to generalize to local contexts due to limited low-level features. We introduce a hybrid transformer combined with a convolutional mixing network to overcome computational and long-range dependency issues. A pretrained transformer network is introduced as a feature-extracting encoder, and a mixing module network (MMNet) is introduced to capture the long-range dependencies with a reduced computational cost. Precisely, in the mixing module network, we use depth-wise and 1 × 1 convolution to model long-range dependencies to establish spatial and cross-channel correlation, respectively. The proposed approach is evaluated qualitatively and quantitatively on five challenging polyp datasets across six metrics. Our MMNet outperforms the previous best polyp segmentation methods.
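The mixing module described above models spatial and cross-channel correlation with depth-wise and 1 x 1 convolutions, respectively. The block below is a generic sketch of that pattern (a large-kernel depth-wise convolution followed by a point-wise convolution with a residual connection); the kernel size, activation, and residual choice are our assumptions rather than MMNet's exact design.

```python
import torch
import torch.nn as nn

class MixingBlock(nn.Module):
    """Toy mixing block: a large-kernel depth-wise convolution establishes
    spatial (long-range) correlation per channel, and a 1x1 point-wise
    convolution mixes information across channels."""

    def __init__(self, channels: int, kernel_size: int = 7):
        super().__init__()
        self.depthwise = nn.Conv2d(channels, channels, kernel_size,
                                   padding=kernel_size // 2, groups=channels)
        self.pointwise = nn.Conv2d(channels, channels, kernel_size=1)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.pointwise(self.act(self.depthwise(x)))  # residual mixing

block = MixingBlock(channels=96)
print(block(torch.randn(1, 96, 28, 28)).shape)  # torch.Size([1, 96, 28, 28])
```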
Affiliation(s)
- Raman Ghimire: Pattern Recognition and Machine Learning Lab, Department of IT Convergence Engineering, Gachon University, Seongnam 13557, Republic of Korea
- Sang-Woong Lee: Pattern Recognition and Machine Learning Lab, Department of AI Software, Gachon University, Seongnam 13557, Republic of Korea
64. Shamshad F, Khan S, Zamir SW, Khan MH, Hayat M, Khan FS, Fu H. Transformers in medical imaging: A survey. Med Image Anal 2023;88:102802. PMID: 37315483. DOI: 10.1016/j.media.2023.102802.
Abstract
Following unprecedented success on the natural language tasks, Transformers have been successfully applied to several computer vision problems, achieving state-of-the-art results and prompting researchers to reconsider the supremacy of convolutional neural networks (CNNs) as de facto operators. Capitalizing on these advances in computer vision, the medical imaging field has also witnessed growing interest for Transformers that can capture global context compared to CNNs with local receptive fields. Inspired from this transition, in this survey, we attempt to provide a comprehensive review of the applications of Transformers in medical imaging covering various aspects, ranging from recently proposed architectural designs to unsolved issues. Specifically, we survey the use of Transformers in medical image segmentation, detection, classification, restoration, synthesis, registration, clinical report generation, and other tasks. In particular, for each of these applications, we develop taxonomy, identify application-specific challenges as well as provide insights to solve them, and highlight recent trends. Further, we provide a critical discussion of the field's current state as a whole, including the identification of key challenges, open problems, and outlining promising future directions. We hope this survey will ignite further interest in the community and provide researchers with an up-to-date reference regarding applications of Transformer models in medical imaging. Finally, to cope with the rapid development in this field, we intend to regularly update the relevant latest papers and their open-source implementations at https://github.com/fahadshamshad/awesome-transformers-in-medical-imaging.
Affiliation(s)
- Fahad Shamshad: MBZ University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
- Salman Khan: MBZ University of Artificial Intelligence, Abu Dhabi, United Arab Emirates; CECS, Australian National University, Canberra ACT 0200, Australia
- Syed Waqas Zamir: Inception Institute of Artificial Intelligence, Abu Dhabi, United Arab Emirates
- Munawar Hayat: Faculty of IT, Monash University, Clayton VIC 3800, Australia
- Fahad Shahbaz Khan: MBZ University of Artificial Intelligence, Abu Dhabi, United Arab Emirates; Computer Vision Laboratory, Linköping University, Sweden
- Huazhu Fu: Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore
65. Xie Y, Yu Y, Liao M, Sun C. Gastric polyp detection module based on improved attentional feature fusion. Biomed Eng Online 2023;22:72. PMID: 37468936. DOI: 10.1186/s12938-023-01130-x.
Abstract
Gastric cancer is a deadly disease, and gastric polyps are at high risk of becoming cancerous. Therefore, the timely detection of gastric polyps is of great importance and can effectively reduce the incidence of gastric cancer. At present, object detection methods based on deep learning are widely used for medical images. However, as the contrast between the background and the polyps is not strong in gastroscopic images, it is difficult to distinguish polyps of various sizes from the background. In this paper, to improve the detection performance for endoscopic gastric polyps, we propose an improved attentional feature fusion module. First, to enhance the contrast between the background and the polyps, we propose an attention module that enables the network to make full use of target location information; it suppresses interference from background information and highlights effective features, so that, on the basis of accurate positioning, the network can focus on deciding whether the current location contains a gastric polyp or background. This attention module is then combined with our feature fusion module to form a new attentional feature fusion model that mitigates the effects of semantic differences during feature fusion, using multi-scale fusion information to obtain more accurate attention weights and improve the detection performance for polyps of different sizes. In this work, we conduct experiments on our own dataset of gastric polyps. Experimental results show that the proposed attentional feature fusion module outperforms the common feature fusion module and reduces missed and false polyp detections.
Affiliation(s)
- Yun Xie: School of Intelligence Science and Technology, University of Science and Technology Beijing, Beijing, China
- Yao Yu: School of Intelligence Science and Technology, University of Science and Technology Beijing, Beijing, China
- Mingchao Liao: School of Intelligence Science and Technology, University of Science and Technology Beijing, Beijing, China
- Changyin Sun: School of Artificial Intelligence, Anhui University, Anhui, China
66. Bian H, Jiang M, Qian J. The investigation of constraints in implementing robust AI colorectal polyp detection for sustainable healthcare system. PLoS One 2023;18:e0288376. PMID: 37437026. DOI: 10.1371/journal.pone.0288376.
Abstract
Colorectal cancer (CRC) is one of the significant threats to public health and the sustainable healthcare system during urbanization. As the primary screening method, colonoscopy can effectively detect polyps before they evolve into cancerous growths. However, current visual inspection by endoscopists is insufficient to provide consistently reliable polyp detection for colonoscopy videos and images in CRC screening. Artificial Intelligence (AI)-based object detection is considered a potent solution to overcome the limitations of visual inspection and mitigate human errors in colonoscopy. This study implemented a YOLOv5 object detection model to investigate the performance of mainstream one-stage approaches in colorectal polyp detection. Meanwhile, a variety of training datasets and model structure configurations were employed to identify the determinative factors in practical applications. The designed experiments show that the model yields acceptable results when assisted by transfer learning, and highlight that the primary constraint in implementing deep learning polyp detection is the scarcity of training data. Model performance improved by 15.6% in terms of average precision (AP) when the original training dataset was expanded. Furthermore, the experimental results were analysed from a clinical perspective to identify potential causes of false positives. In addition, a quality management framework is proposed for future dataset preparation and model development in AI-driven polyp detection tasks for smart healthcare solutions.
Affiliation(s)
- Haitao Bian: College of Safety Science and Engineering, Nanjing Tech University, Nanjing, Jiangsu, China
- Min Jiang: KLA Corporation, Milpitas, California, United States of America
- Jingjing Qian: Department of Gastroenterology, The Second Hospital of Nanjing, Nanjing University of Chinese Medicine, Nanjing, Jiangsu, China
67. Wang J, Tian S, Yu L, Zhou Z, Wang F, Wang Y. HIGF-Net: Hierarchical information-guided fusion network for polyp segmentation based on transformer and convolution feature learning. Comput Biol Med 2023;161:107038. PMID: 37230017. DOI: 10.1016/j.compbiomed.2023.107038.
Abstract
Polyp segmentation plays an important role in image analysis during colonoscopy screening and thus improves the diagnostic efficiency for early colorectal cancer. However, owing to the variable shape and size of polyps, the small difference between lesion areas and the background, and interference from image acquisition conditions, existing segmentation methods tend to miss polyps and produce rough boundary divisions. To overcome these challenges, we propose a multi-level fusion network called HIGF-Net, which uses a hierarchical guidance strategy to aggregate rich information and produce reliable segmentation results. Specifically, HIGF-Net excavates deep global semantic information and shallow local spatial features of images with a Transformer encoder and a CNN encoder together. Then, a double-stream structure is used to transmit polyp shape properties between feature layers at different depths; this module calibrates the position and shape of polyps of different sizes to improve the model's efficient use of the rich polyp features. In addition, a Separate Refinement module refines the polyp profile in the uncertain region to highlight the difference between the polyp and the background. Finally, in order to adapt to diverse collection environments, a Hierarchical Pyramid Fusion module merges the features of multiple layers with different representational capabilities. We evaluate the learning and generalization abilities of HIGF-Net on five datasets (Kvasir-SEG, CVC-ClinicDB, ETIS, CVC-300, and CVC-ColonDB) using six evaluation metrics. Experimental results show that the proposed model is effective in polyp feature mining and lesion identification, and its segmentation performance is better than that of ten excellent models.
Affiliation(s)
- Junwen Wang: College of Software, Xinjiang University, Urumqi, 830000, China; Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000, China
- Shengwei Tian: College of Software, Xinjiang University, Urumqi, 830000, China; Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000, China
- Long Yu: College of Network Center, Xinjiang University, Urumqi, 830000, China; Signal and Signal Processing Laboratory, College of Information Science and Engineering, Xinjiang University, Urumqi, 830000, China
- Zhicheng Zhou: College of Software, Xinjiang University, Urumqi, 830000, China; Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000, China
- Fan Wang: College of Software, Xinjiang University, Urumqi, 830000, China; Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000, China
- Yongtao Wang: College of Software, Xinjiang University, Urumqi, 830000, China; Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000, China
68. Nanni L, Fantozzi C, Loreggia A, Lumini A. Ensembles of Convolutional Neural Networks and Transformers for Polyp Segmentation. Sensors (Basel) 2023;23:4688. PMID: 37430601. DOI: 10.3390/s23104688.
Abstract
In the realm of computer vision, semantic segmentation is the task of recognizing objects in images at the pixel level. This is done by performing a classification of each pixel. The task is complex and requires sophisticated skills and knowledge about the context to identify objects' boundaries. The importance of semantic segmentation in many domains is undisputed. In medical diagnostics, it simplifies the early detection of pathologies, thus mitigating the possible consequences. In this work, we provide a review of the literature on deep ensemble learning models for polyp segmentation and develop new ensembles based on convolutional neural networks and transformers. The development of an effective ensemble entails ensuring diversity between its components. To this end, we combined different models (HarDNet-MSEG, Polyp-PVT, and HSNet) trained with different data augmentation techniques, optimization methods, and learning rates, which we experimentally demonstrate to be useful to form a better ensemble. Most importantly, we introduce a new method to obtain the segmentation mask by averaging intermediate masks after the sigmoid layer. In our extensive experimental evaluation, the average performance of the proposed ensembles over five prominent datasets beat any other solution that we know of. Furthermore, the ensembles also performed better than the state-of-the-art on two of the five datasets, when individually considered, without having been specifically trained for them.
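The key trick highlighted above is obtaining the ensemble mask by averaging the members' probability maps after the sigmoid layer rather than combining already-thresholded masks. A minimal sketch of that averaging step, assuming per-model logit outputs and a placeholder threshold:

```python
import torch

def ensemble_mask(logit_maps, threshold: float = 0.5) -> torch.Tensor:
    """Average the per-model probability maps (after the sigmoid) and then
    binarize, rather than voting over already-thresholded masks."""
    probs = torch.stack([torch.sigmoid(m) for m in logit_maps], dim=0)
    return (probs.mean(dim=0) > threshold).float()

# Three stand-in model outputs (logits) for the same image:
outputs = [torch.randn(1, 1, 352, 352) for _ in range(3)]
final_mask = ensemble_mask(outputs)
print(final_mask.shape, final_mask.unique())
```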
Affiliation(s)
- Loris Nanni: Department of Information Engineering, University of Padova, 35122 Padova, Italy
- Carlo Fantozzi: Department of Information Engineering, University of Padova, 35122 Padova, Italy
- Andrea Loreggia: Department of Information Engineering, University of Brescia, 25121 Brescia, Italy
- Alessandra Lumini: Department of Computer Science and Engineering, University of Bologna, 40126 Bologna, Italy
69. Hu K, Chen W, Sun Y, Hu X, Zhou Q, Zheng Z. PPNet: Pyramid pooling based network for polyp segmentation. Comput Biol Med 2023;160:107028. PMID: 37201273. DOI: 10.1016/j.compbiomed.2023.107028.
Abstract
Colonoscopy is the gold standard method for investigating the gastrointestinal tract. Localizing polyps in colonoscopy images plays a vital role during colonoscopy screening and is also quite important for subsequent treatment, e.g., polyp resection. Many deep learning-based methods have been applied to the polyp segmentation problem. However, precise polyp segmentation is still an open issue. Considering the effectiveness of the Pyramid Pooling Transformer (P2T) in modeling long-range dependencies and capturing robust contextual features, as well as the power of pyramid pooling in extracting features, we propose a pyramid pooling based network for polyp segmentation, namely PPNet. We first adopt the P2T as the encoder for extracting more powerful features. Next, a pyramid feature fusion module (PFFM) combined with a channel attention scheme is utilized to learn a global contextual feature, in order to guide the information transition in the decoder branch. Aiming to enhance the effectiveness of PPNet on feature extraction during the decoder stage layer by layer, we introduce a memory-keeping pyramid pooling module (MPPM) into each side branch of the encoder and transmit the corresponding feature to each lower-level side branch. Experimental results conducted on five public colorectal polyp segmentation datasets are given and discussed. Our method performs better than several state-of-the-art polyp segmentation networks, which demonstrates the effectiveness of the pyramid pooling mechanism for colorectal polyp segmentation.
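Pyramid pooling, the core mechanism above, aggregates context by pooling a feature map to several grid sizes and fusing the results back at full resolution. The module below is a generic PSPNet-style sketch of the idea, not the PPNet implementation; the bin sizes and projection layer are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidPooling(nn.Module):
    """Toy pyramid pooling: pool the feature map to several grid sizes,
    project each pooled map, upsample back, and concatenate with the input
    to aggregate multi-scale context."""

    def __init__(self, channels: int, bins=(1, 2, 3, 6)):
        super().__init__()
        self.stages = nn.ModuleList([
            nn.Sequential(nn.AdaptiveAvgPool2d(b),
                          nn.Conv2d(channels, channels // len(bins), kernel_size=1))
            for b in bins
        ])
        self.project = nn.Conv2d(channels * 2, channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        size = x.shape[-2:]
        pooled = [F.interpolate(stage(x), size=size, mode="bilinear", align_corners=False)
                  for stage in self.stages]
        return self.project(torch.cat([x] + pooled, dim=1))

ppm = PyramidPooling(channels=64)
print(ppm(torch.randn(1, 64, 22, 22)).shape)  # torch.Size([1, 64, 22, 22])
```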
Affiliation(s)
- Keli Hu: Department of Computer Science and Engineering, Shaoxing University, Shaoxing, 312000, PR China; Cancer Center, Department of Gastroenterology, Zhejiang Provincial People's Hospital (Affiliated People's Hospital, Hangzhou Medical College), Hangzhou, 310014, PR China; Information Technology R&D Innovation Center of Peking University, Shaoxing, 312000, PR China
- Wenping Chen: Department of Computer Science and Engineering, Shaoxing University, Shaoxing, 312000, PR China
- YuanZe Sun: Department of Computer Science and Engineering, Shaoxing University, Shaoxing, 312000, PR China
- Xiaozhao Hu: Shaoxing People's Hospital, Shaoxing, 312000, PR China
- Qianwei Zhou: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310023, PR China
- Zirui Zheng: Department of Computer Science and Engineering, Shaoxing University, Shaoxing, 312000, PR China
70. Wang KN, Zhuang S, Ran QY, Zhou P, Hua J, Zhou GQ, He X. DLGNet: A dual-branch lesion-aware network with the supervised Gaussian Mixture model for colon lesions classification in colonoscopy images. Med Image Anal 2023;87:102832. PMID: 37148864. DOI: 10.1016/j.media.2023.102832.
Abstract
Colorectal cancer is one of the malignant tumors with the highest mortality due to the lack of obvious early symptoms. It is usually in the advanced stage when it is discovered. Thus the automatic and accurate classification of early colon lesions is of great significance for clinically estimating the status of colon lesions and formulating appropriate diagnostic programs. However, it is challenging to classify full-stage colon lesions due to the large inter-class similarities and intra-class differences of the images. In this work, we propose a novel dual-branch lesion-aware neural network (DLGNet) to classify intestinal lesions by exploring the intrinsic relationship between diseases, composed of four modules: lesion location module, dual-branch classification module, attention guidance module, and inter-class Gaussian loss function. Specifically, the elaborate dual-branch module integrates the original image and the lesion patch obtained by the lesion localization module to explore and interact with lesion-specific features from a global and local perspective. Also, the feature-guided module guides the model to pay attention to the disease-specific features by learning remote dependencies through spatial and channel attention after network feature learning. Finally, the inter-class Gaussian loss function is proposed, which assumes that each feature extracted by the network is an independent Gaussian distribution, and the inter-class clustering is more compact, thereby improving the discriminative ability of the network. The extensive experiments on the collected 2568 colonoscopy images have an average accuracy of 91.50%, and the proposed method surpasses the state-of-the-art methods. This study is the first time that colon lesions are classified at each stage and achieves promising colon disease classification performance. To motivate the community, we have made our code publicly available via https://github.com/soleilssss/DLGNet.
Collapse
Affiliation(s)
- Kai-Ni Wang
- School of Biological Science and Medical Engineering, Southeast University, Nanjing, China; State Key Laboratory of Digital Medical Engineering, Southeast University, Nanjing, China; Jiangsu Key Laboratory of Biomaterials and Devices, Southeast University, Nanjing, China
| | - Shuaishuai Zhuang
- The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
| | - Qi-Yong Ran
- School of Biological Science and Medical Engineering, Southeast University, Nanjing, China; State Key Laboratory of Digital Medical Engineering, Southeast University, Nanjing, China; Jiangsu Key Laboratory of Biomaterials and Devices, Southeast University, Nanjing, China
| | - Ping Zhou
- School of Biological Science and Medical Engineering, Southeast University, Nanjing, China; State Key Laboratory of Digital Medical Engineering, Southeast University, Nanjing, China; Jiangsu Key Laboratory of Biomaterials and Devices, Southeast University, Nanjing, China
| | - Jie Hua
- The First Affiliated Hospital of Nanjing Medical University, Nanjing, China; Liyang People's Hospital, Liyang Branch Hospital of Jiangsu Province Hospital, Liyang, China
| | - Guang-Quan Zhou
- School of Biological Science and Medical Engineering, Southeast University, Nanjing, China; State Key Laboratory of Digital Medical Engineering, Southeast University, Nanjing, China; Jiangsu Key Laboratory of Biomaterials and Devices, Southeast University, Nanjing, China.
| | - Xiaopu He
- The First Affiliated Hospital of Nanjing Medical University, Nanjing, China.
| |
Collapse
|
71
|
Su Y, Cheng J, Zhong C, Zhang Y, Ye J, He J, Liu J. FeDNet: Feature Decoupled Network for polyp segmentation from endoscopy images. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104699] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/19/2023]
|
72
|
Wei X, Ye F, Wan H, Xu J, Min W. TANet: Triple Attention Network for medical image segmentation. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/24/2023]
|
73
|
Dhaliwal J, Walsh CM. Artificial Intelligence in Pediatric Endoscopy: Current Status and Future Applications. Gastrointest Endosc Clin N Am 2023; 33:291-308. [PMID: 36948747 DOI: 10.1016/j.giec.2022.12.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/24/2023]
Abstract
The application of artificial intelligence (AI) has great promise for improving pediatric endoscopy. The majority of preclinical studies have been undertaken in adults, with the greatest progress being made in the context of colorectal cancer screening and surveillance. This development has only been possible with advances in deep learning, like the convolutional neural network model, which has enabled real-time detection of pathology. Comparatively, the majority of deep learning systems developed in inflammatory bowel disease have focused on predicting disease severity and were developed using still images rather than videos. The application of AI to pediatric endoscopy is in its infancy, thus providing an opportunity to develop clinically meaningful and fair systems that do not perpetuate societal biases. In this review, we provide an overview of AI, summarize the advances of AI in endoscopy, and describe its potential application to pediatric endoscopic practice and education.
Collapse
Affiliation(s)
- Jasbir Dhaliwal
- Division of Pediatric Gastroenterology, Hepatology and Nutrition, Cincinnati Children's Hospital Medical Center, University of Cincinnati, OH, USA.
| | - Catharine M Walsh
- Division of Gastroenterology, Hepatology, and Nutrition, and the SickKids Research and Learning Institutes, The Hospital for Sick Children, Toronto, ON, Canada; Department of Paediatrics and The Wilson Centre, University of Toronto, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
| |
Collapse
|
74
|
Gong R, He S, Tian T, Chen J, Hao Y, Qiao C. FRCNN-AA-CIF: An automatic detection model of colon polyps based on attention awareness and context information fusion. Comput Biol Med 2023; 158:106787. [PMID: 37044051 DOI: 10.1016/j.compbiomed.2023.106787] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2022] [Revised: 03/03/2023] [Accepted: 03/11/2023] [Indexed: 04/08/2023]
Abstract
It is noted that the foreground and background of polyp images captured under colonoscopy are not highly differentiated, and the feature maps extracted by common deep learning object detection models keep getting smaller as the network depth increases. Therefore, these models tend to ignore details in the images, resulting in a high polyp missed detection rate. To reduce the missed detection rate, this paper proposes an automatic detection model of colon polyps based on attention awareness and context information fusion (FRCNN-AA-CIF), built on the two-stage object detection model Faster Region-Convolutional Neural Network (Faster R-CNN). First, since adding attention awareness can make the feature extraction network pay more attention to polyp features, we propose an attention awareness module based on the Squeeze-and-Excitation Network (SENet) and the Efficient Channel Attention module (ECA-Net) and add it after each block of the backbone network. Specifically, we first use the 1×1 convolution of ECA-Net to extract local cross-channel information and then use the two fully connected layers of SENet to reduce and increase the dimension, to filter out the channels that are more useful for feature learning. Further, because of the presence of air bubbles, impurities, inflammation, and accumulation of digestive matter around polyps, we use context information around polyps to enhance the focus on polyp features. In particular, after the network extracts the region of interest, we fuse the region of interest with its context information to improve the detection rate of polyps. The proposed model was tested on the colonoscopy dataset provided by Huashan Hospital. Numerical experiments show that FRCNN-AA-CIF has the highest detection accuracy (mAP of 0.817), the lowest missed detection rate of 4.22%, and the best classification effect (AUC of 95.98%): its mAP increased by 3.3%, MDR decreased by 1.97%, and AUC increased by 1.8%. Compared with other object detection models, FRCNN-AA-CIF significantly improves recognition accuracy and reduces the missed detection rate.
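The attention-awareness module is described as an ECA-style step followed by an SE-style bottleneck; a minimal hedged sketch of such a channel-attention block is below. The kernel size, reduction ratio, and exact composition are assumptions, not the FRCNN-AA-CIF implementation.

```python
import torch
import torch.nn as nn

class ChannelAttentionAA(nn.Module):
    """Illustrative ECA-then-SE channel attention (channels must exceed `reduction`)."""
    def __init__(self, channels, k_size=3, reduction=16):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        # ECA-style local cross-channel interaction via a 1D convolution
        self.eca = nn.Conv1d(1, 1, kernel_size=k_size, padding=k_size // 2, bias=False)
        # SE-style squeeze-and-excitation: reduce then restore the channel dimension
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
        )
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        b, c, _, _ = x.shape
        y = self.avg_pool(x).view(b, 1, c)       # (B, 1, C) channel descriptor
        y = self.eca(y).view(b, c)               # local cross-channel mixing
        y = self.sigmoid(self.fc(y)).view(b, c, 1, 1)
        return x * y                              # reweight the input channels
```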
Collapse
|
75
|
Cherubini A, Dinh NN. A Review of the Technology, Training, and Assessment Methods for the First Real-Time AI-Enhanced Medical Device for Endoscopy. Bioengineering (Basel) 2023; 10:404. [PMID: 37106592 PMCID: PMC10136070 DOI: 10.3390/bioengineering10040404] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2023] [Revised: 02/25/2023] [Accepted: 03/22/2023] [Indexed: 04/29/2023] Open
Abstract
Artificial intelligence (AI) has the potential to assist in endoscopy and improve decision making, particularly in situations where humans may make inconsistent judgments. The performance assessment of the medical devices operating in this context is a complex combination of bench tests, randomized controlled trials, and studies on the interaction between physicians and AI. We review the scientific evidence published about GI Genius, the first AI-powered medical device for colonoscopy to enter the market, and the device that is most widely tested by the scientific community. We provide an overview of its technical architecture, AI training and testing strategies, and regulatory path. In addition, we discuss the strengths and limitations of the current platform and its potential impact on clinical practice. The details of the algorithm architecture and the data that were used to train the AI device have been disclosed to the scientific community in the pursuit of a transparent AI. Overall, the first AI-enabled medical device for real-time video analysis represents a significant advancement in the use of AI for endoscopies and has the potential to improve the accuracy and efficiency of colonoscopy procedures.
Collapse
Affiliation(s)
- Andrea Cherubini
- Cosmo Intelligent Medical Devices, D02KV60 Dublin, Ireland
- Milan Center for Neuroscience, University of Milano–Bicocca, 20126 Milano, Italy
| | - Nhan Ngo Dinh
- Cosmo Intelligent Medical Devices, D02KV60 Dublin, Ireland
| |
Collapse
|
76
|
Liu Q, Han Z, Liu Z, Zhang J. HMA-Net: A deep U-shaped network combined with HarDNet and multi-attention mechanism for medical image segmentation. Med Phys 2023; 50:1635-1646. [PMID: 36303466 DOI: 10.1002/mp.16065] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2022] [Revised: 09/14/2022] [Accepted: 10/11/2021] [Indexed: 11/11/2022] Open
Abstract
BACKGROUND Automatic segmentation of lesions, organs, and tissues from medical images is an important part of medical image analysis and is useful for improving the accuracy of disease diagnosis and clinical analysis. For skin melanoma lesions, the contrast between lesions and surrounding skin is low, and there are many irregular shapes, uneven distributions, and local and boundary features. Moreover, hair covering the lesions destroys the local context. Polyp characteristics such as shape, size, and appearance vary at different development stages. Early polyps with small sizes have no distinctive features and can easily be mistaken for other intestinal structures, such as wrinkles and folds. Imaging positions and illumination conditions can alter polyps' appearance and lead to no visible transition between polyps and surrounding tissue. It remains a challenging task to accurately segment skin lesions and polyps due to the high variability in the location, shape, size, color, and texture of the target object. Developing a robust and accurate segmentation method for medical images is therefore necessary. PURPOSE To achieve better segmentation performance while dealing with the difficulties above, a U-shaped network based on the encoder-decoder structure is proposed to enhance segmentation performance in target regions. METHODS In this paper, a novel deep encoder-decoder model that combines HarDNet, dual attention (DA), and reverse attention (RA) is proposed. First, HarDNet68 is employed to extract the backbone features while improving inference speed and computational efficiency. Second, the DA block is adopted to capture global feature dependencies in the spatial and channel dimensions and enrich the contextual information of local features. At last, three RA blocks are exploited to fuse and refine the boundary features to obtain the final segmentation results. RESULTS Extensive experiments are conducted on a skin lesion dataset which consists of ISIC2016, ISIC2017, and ISIC2018, and a polyp dataset which consists of several public datasets, that is, Kvasir, CVC-ClinicDB, CVC-ColonDB, ETIS, and EndoScene. The proposed method outperforms some state-of-the-art segmentation models on the ISIC2018, ISIC2017, and ISIC2016 datasets, with Jaccard indexes of 0.846, 0.881, and 0.894, mean Dice coefficients of 0.907, 0.929, and 0.939, precisions of 0.908, 0.977, and 0.968, and accuracies of 0.953, 0.975, and 0.972. Additionally, the proposed method also performs better than some state-of-the-art segmentation models on the Kvasir, CVC-ClinicDB, CVC-ColonDB, ETIS, and EndoScene datasets, with mean Dice coefficients of 0.907, 0.935, 0.716, 0.667, and 0.887, mean intersection over union coefficients of 0.850, 0.885, 0.644, 0.595, and 0.821, structural similarity measures of 0.918, 0.953, 0.823, 0.807, and 0.933, enhanced alignment measures of 0.952, 0.983, 0.850, 0.817, and 0.957, and mean absolute errors of 0.026, 0.007, 0.037, 0.030, and 0.009. CONCLUSIONS The proposed deep network can improve lesion segmentation performance in polyp and skin lesion images. The quantitative and qualitative results show that the proposed method can effectively handle the challenging task of segmentation while revealing great potential for clinical application.
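The DA block referenced above follows the well-known dual-attention pattern (a position/spatial branch plus a channel branch); a hedged sketch is given below. Channel widths, the learnable residual weights, and the fusion by summation are assumptions rather than the paper's exact block.

```python
import torch
import torch.nn as nn

class DualAttention(nn.Module):
    """Illustrative position + channel attention (channels should be >= 8)."""
    def __init__(self, channels):
        super().__init__()
        self.q = nn.Conv2d(channels, channels // 8, 1)
        self.k = nn.Conv2d(channels, channels // 8, 1)
        self.v = nn.Conv2d(channels, channels, 1)
        self.gamma_p = nn.Parameter(torch.zeros(1))   # learnable residual weights
        self.gamma_c = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b, c, h, w = x.shape
        # Position attention: every pixel attends to every other pixel
        q = self.q(x).flatten(2).transpose(1, 2)      # (B, HW, C/8)
        k = self.k(x).flatten(2)                      # (B, C/8, HW)
        attn_p = torch.softmax(q @ k, dim=-1)         # (B, HW, HW)
        v = self.v(x).flatten(2)                      # (B, C, HW)
        out_p = (v @ attn_p.transpose(1, 2)).view(b, c, h, w)
        # Channel attention: channel-to-channel affinities
        f = x.flatten(2)                              # (B, C, HW)
        attn_c = torch.softmax(f @ f.transpose(1, 2), dim=-1)
        out_c = (attn_c @ f).view(b, c, h, w)
        return x + self.gamma_p * out_p + self.gamma_c * out_c
```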
Collapse
Affiliation(s)
- Qiaohong Liu
- School of Medical Instruments, Shanghai University of Medicine and Health Sciences, Shanghai, China
| | - Ziqi Han
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, China
| | - Ziling Liu
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, China
| | - Juan Zhang
- School of Electronic and Electrical Engineering, Control Engineering, Shanghai University of Engineering Science, Shanghai, China
| |
Collapse
|
77
|
Wang K, Liu L, Fu X, Liu L, Peng W. RA-DENet: Reverse Attention and Distractions Elimination Network for polyp segmentation. Comput Biol Med 2023; 155:106704. [PMID: 36848801 DOI: 10.1016/j.compbiomed.2023.106704] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2022] [Revised: 02/01/2023] [Accepted: 02/19/2023] [Indexed: 02/27/2023]
Abstract
To address the problems of polyps of different shapes, sizes, and colors, low-contrast polyps, various noise distractions, and blurred edges in colonoscopy, we propose the Reverse Attention and Distraction Elimination Network, which includes Improved Reverse Attention, Distraction Elimination, and Feature Enhancement. First, we input the images in the polyp image set and use the five levels of polyp features and the global polyp feature extracted from the Res2Net-based backbone as the input of the Improved Reverse Attention to obtain augmented representations of salient and non-salient regions, in order to capture the different shapes of polyps and distinguish low-contrast polyps from the background. Then, the augmented representations of salient and non-salient areas are fed into the Distraction Elimination to obtain the refined polyp feature without false positive and false negative distractions, eliminating noise. Finally, the extracted low-level polyp feature is used as the input of the Feature Enhancement to obtain the edge feature that supplements the missing edge information of the polyp. The polyp segmentation result is output by connecting the edge feature with the refined polyp feature. The proposed method is evaluated on five polyp datasets and compared with current polyp segmentation models. Our model improves the mDice to 0.760 on the most challenging dataset (ETIS).
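Reverse attention, as popularized by PraNet and reused here, can be summarized in a short hedged sketch: a coarse prediction is inverted so the network focuses on currently non-salient regions, and the attended feature produces a residual that refines the prediction. Channel sizes and the refinement head are assumptions, not RA-DENet's exact Improved Reverse Attention.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReverseAttention(nn.Module):
    def __init__(self, in_ch):
        super().__init__()
        self.refine = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, 3, padding=1),
        )

    def forward(self, feat, coarse_logits):
        # Upsample the coarse prediction to the feature resolution
        coarse = F.interpolate(coarse_logits, size=feat.shape[2:],
                               mode='bilinear', align_corners=False)
        rev = 1.0 - torch.sigmoid(coarse)        # attend to non-salient regions
        residual = self.refine(feat * rev)        # learn what the coarse map missed
        return coarse + residual                  # refined logits at this scale
```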
Collapse
Affiliation(s)
- Kaiqi Wang
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China
| | - Li Liu
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Computer Technology Application Key Lab of Yunnan Province, Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China.
| | - Xiaodong Fu
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Computer Technology Application Key Lab of Yunnan Province, Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
| | - Lijun Liu
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Computer Technology Application Key Lab of Yunnan Province, Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
| | - Wei Peng
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Computer Technology Application Key Lab of Yunnan Province, Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
| |
Collapse
|
78
|
DBE-Net: Dual Boundary-Guided Attention Exploration Network for Polyp Segmentation. Diagnostics (Basel) 2023; 13:diagnostics13050896. [PMID: 36900040 PMCID: PMC10001089 DOI: 10.3390/diagnostics13050896] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2023] [Revised: 02/22/2023] [Accepted: 02/23/2023] [Indexed: 03/02/2023] Open
Abstract
Automatic segmentation of polyps during colonoscopy can help doctors accurately find the polyp area and remove abnormal tissues in time to reduce the possibility of polyps transforming into cancer. However, the current polyp segmentation research still has the following problems: blurry polyp boundaries, multi-scale adaptability of polyps, and close resemblances between polyps and nearby normal tissues. To tackle these issues, this paper proposes a dual boundary-guided attention exploration network (DBE-Net) for polyp segmentation. Firstly, we propose a dual boundary-guided attention exploration module to solve the boundary-blurring problem. This module uses a coarse-to-fine strategy to progressively approximate the real polyp boundary. Secondly, a multi-scale context aggregation enhancement module is introduced to accommodate the multi-scale variation of polyps. Finally, we propose a low-level detail enhancement module, which can extract more low-level details and promote the performance of the overall network. Extensive experiments on five polyp segmentation benchmark datasets show that our method achieves superior performance and stronger generalization ability than state-of-the-art methods. Especially for CVC-ColonDB and ETIS, two challenging datasets among the five datasets, our method achieves excellent results of 82.4% and 80.6% in terms of mDice (mean dice similarity coefficient) and improves by 5.1% and 5.9% compared to the state-of-the-art methods.
Collapse
|
79
|
Ali S, Jha D, Ghatwary N, Realdon S, Cannizzaro R, Salem OE, Lamarque D, Daul C, Riegler MA, Anonsen KV, Petlund A, Halvorsen P, Rittscher J, de Lange T, East JE. A multi-centre polyp detection and segmentation dataset for generalisability assessment. Sci Data 2023; 10:75. [PMID: 36746950 PMCID: PMC9902556 DOI: 10.1038/s41597-023-01981-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/20/2022] [Accepted: 01/23/2023] [Indexed: 02/08/2023] Open
Abstract
Polyps in the colon are widely known cancer precursors identified by colonoscopy. Whilst most polyps are benign, the polyp's number, size and surface structure are linked to the risk of colon cancer. Several methods have been developed to automate polyp detection and segmentation. However, the main issue is that they are not tested rigorously on a large multicentre purpose-built dataset, one reason being the lack of a comprehensive public dataset. As a result, the developed methods may not generalise to different population datasets. To this extent, we have curated a dataset from six unique centres incorporating more than 300 patients. The dataset includes both single frame and sequence data with 3762 annotated polyp labels with precise delineation of polyp boundaries verified by six senior gastroenterologists. To our knowledge, this is the most comprehensive detection and pixel-level segmentation dataset (referred to as PolypGen) curated by a team of computational scientists and expert gastroenterologists. The paper provides insight into data construction and annotation strategies, quality assurance, and technical validation.
Collapse
Affiliation(s)
- Sharib Ali
- School of Computing, University of Leeds, LS2 9JT, Leeds, United Kingdom.
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, OX3 7DQ, Oxford, United Kingdom.
- Oxford National Institute for Health Research Biomedical Research centre, OX4 2PG, Oxford, United Kingdom.
| | - Debesh Jha
- SimulaMet, Pilestredet 52, 0167, Oslo, Norway
- Department of Computer Science, UiT The Arctic University of Norway, Hansine Hansens veg 18, 9019, Tromsø, Norway
- Machine & Hybrid Intelligence Lab, Department of Radiology, Northwestern University, Chicago, USA
| | - Noha Ghatwary
- Computer Engineering Department, Arab Academy for Science and Technology,Smart Village, Giza, Egypt
| | - Stefano Realdon
- Oncological Gastroenterology - Centro di Riferimento Oncologico di Aviano (CRO), IRCCS, 2, 33081, Aviano, PN, Italy
| | - Renato Cannizzaro
- Oncological Gastroenterology - Centro di Riferimento Oncologico di Aviano (CRO), IRCCS, 2, 33081, Aviano, PN, Italy
- Department of Medical, Surgical and Health Sciences, University of Trieste, 34127, Trieste, Italy
| | - Osama E Salem
- Faculty of Medicine, University of Alexandria, 21131, Alexandria, Egypt
| | - Dominique Lamarque
- Université de Versailles St-Quentin en Yvelines, Hôpital Ambroise Paré, 9 Av. Charles de Gaulle, 92100, Boulogne-Billancourt, France
| | - Christian Daul
- CRAN UMR 7039, Université de Lorraine and CNRS, F-54010, Vandœuvre-Lès-Nancy, France
| | - Michael A Riegler
- SimulaMet, Pilestredet 52, 0167, Oslo, Norway
- Department of Computer Science, UiT The Arctic University of Norway, Hansine Hansens veg 18, 9019, Tromsø, Norway
| | - Kim V Anonsen
- Oslo University Hospital Ullevål, Kirkeveien 166, 0450, Oslo, Norway
| | | | - Pål Halvorsen
- SimulaMet, Pilestredet 52, 0167, Oslo, Norway
- Oslo Metropolitan University, Pilestredet 46, 0167, Oslo, Norway
| | - Jens Rittscher
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, OX3 7DQ, Oxford, United Kingdom
- Oxford National Institute for Health Research Biomedical Research centre, OX4 2PG, Oxford, United Kingdom
| | - Thomas de Lange
- Augere Medical, Nedre Vaskegang 6, 0186, Oslo, Norway
- Medical Department, Sahlgrenska University Hospital-Mölndal, Blå stråket 5, 413 45, Göteborg, Sweden
- Department of Molecular and Clinical Medicine, Sahlgrenska Academy, University of Gothenburg, 41345, Göteborg, Sweden
| | - James E East
- Oxford National Institute for Health Research Biomedical Research centre, OX4 2PG, Oxford, United Kingdom
- Translational Gastroenterology Unit, Experimental Medicine Div., John Radcliffe Hospital, University of Oxford, OX3 9DU, Oxford, United Kingdom
| |
Collapse
|
80
|
Zheng J, Liu H, Feng Y, Xu J, Zhao L. CASF-Net: Cross-attention and cross-scale fusion network for medical image segmentation. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023; 229:107307. [PMID: 36571889 DOI: 10.1016/j.cmpb.2022.107307] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/06/2022] [Revised: 11/22/2022] [Accepted: 12/09/2022] [Indexed: 06/18/2023]
Abstract
BACKGROUND Automatic segmentation of medical images has progressed greatly owing to the development of convolutional neural networks (CNNs). However, there are two open questions with current approaches based on convolutional operations: (1) how to eliminate the general limitation that CNNs lack the ability to model long-range dependencies and global contextual interactions, and (2) how to efficiently discover and integrate the global and local features implied in the image. Notably, these two problems are interconnected, yet previous approaches mainly focus on the first problem and ignore the importance of information integration. METHODS In this paper, we propose a novel cross-attention and cross-scale fusion network (CASF-Net), which aims to explicitly tap the potential of dual-branch networks and fully integrate coarse- and fine-grained feature representations. Specifically, the well-designed dual-branch encoder focuses on modeling non-local dependencies and multi-scale contexts, significantly improving the quality of semantic segmentation. Moreover, the proposed cross-attention and cross-scale module efficiently performs multi-scale information fusion and is capable of further exploring long-range contextual information. RESULTS Extensive experiments conducted on three different types of medical image segmentation tasks demonstrate the state-of-the-art performance of the proposed method both visually and numerically. CONCLUSIONS This paper assembles the feature representation capabilities of CNNs and transformers and proposes cross-attention and cross-scale fusion algorithms. The promising results show new possibilities of using cross-fusion mechanisms in more downstream medical image tasks.
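Cross-attention between two branches can be sketched, under assumptions, as CNN-branch tokens querying transformer-branch tokens; the single-head design, dimensions, and residual fusion below are simplifications, not CASF-Net's actual module.

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.q, self.k, self.v = nn.Linear(dim, dim), nn.Linear(dim, dim), nn.Linear(dim, dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, cnn_feat, trans_feat):
        """cnn_feat, trans_feat: (B, C, H, W) maps from the two branches."""
        b, c, h, w = cnn_feat.shape
        x = cnn_feat.flatten(2).transpose(1, 2)      # CNN tokens as queries
        y = trans_feat.flatten(2).transpose(1, 2)    # transformer tokens as keys/values
        attn = torch.softmax(self.q(x) @ self.k(y).transpose(1, 2) / c ** 0.5, dim=-1)
        fused = self.proj(attn @ self.v(y)) + x      # residual fusion of the branches
        return fused.transpose(1, 2).view(b, c, h, w)
```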
Collapse
Affiliation(s)
- Jianwei Zheng
- College of Computer Science and Engineering, Zhejiang University of Technology, Hangzhou 310014, China.
| | - Hao Liu
- College of Computer Science and Engineering, Zhejiang University of Technology, Hangzhou 310014, China
| | - Yuchao Feng
- College of Computer Science and Engineering, Zhejiang University of Technology, Hangzhou 310014, China
| | - Jinshan Xu
- College of Computer Science and Engineering, Zhejiang University of Technology, Hangzhou 310014, China
| | - Liang Zhao
- Stomatological Hospital of Xiamen Medical College and the Xiamen Key Laboratory of Stomatological Disease Diagnosis and Treatment, Xiamen 361000, China.
| |
Collapse
|
81
|
Meng Y, Zhang H, Zhao Y, Gao D, Hamill B, Patri G, Peto T, Madhusudhan S, Zheng Y. Dual Consistency Enabled Weakly and Semi-Supervised Optic Disc and Cup Segmentation With Dual Adaptive Graph Convolutional Networks. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:416-429. [PMID: 36044486 DOI: 10.1109/tmi.2022.3203318] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Glaucoma is a progressive eye disease that results in permanent vision loss, and the vertical cup-to-disc ratio (vCDR) in colour fundus images is essential in glaucoma screening and assessment. Previous fully supervised convolutional neural networks segment the optic disc (OD) and optic cup (OC) from colour fundus images and then calculate the vCDR offline. However, they rely on a large set of labeled masks for training, which is expensive and time-consuming to acquire. To address this, we propose a weakly and semi-supervised graph-based network that investigates geometric associations and domain knowledge between segmentation probability maps (PM), modified signed distance function representations (mSDF), and boundary region-of-interest characteristics (B-ROI) in three aspects. Firstly, we propose a novel Dual Adaptive Graph Convolutional Network (DAGCN) to reason about the long-range features of the PM and the mSDF w.r.t. the regional uniformity. Secondly, we propose a dual consistency regularization-based semi-supervised learning paradigm. The regional consistency between the PM and the mSDF, and the marginal consistency between the B-ROI derived from each of them, boost the proposed model's performance due to the inherent geometric associations. Thirdly, we exploit task-specific domain knowledge via the oval shapes of the OD and OC, for which a differentiable vCDR-estimating layer is proposed. Furthermore, without additional annotations, the supervision on the vCDR serves as weak supervision for the segmentation tasks. Experiments on six large-scale datasets demonstrate our model's superior performance on OD and OC segmentation and vCDR estimation. The implementation code has been made available at https://github.com/smallmax00/Dual_Adaptive_Graph_Reasoning.
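A differentiable vCDR estimate can be sketched from soft segmentation maps by approximating each structure's vertical extent with a smooth row-occupancy sum; this is one possible surrogate, stated here as an assumption rather than the layer the authors propose.

```python
import torch

def soft_vcdr(cup_prob, disc_prob, eps=1e-6):
    """cup_prob, disc_prob: (B, 1, H, W) probability maps in [0, 1]."""
    # Per-row maximum acts as a soft "row occupied" indicator; summing over rows
    # approximates the vertical diameter of the structure.
    cup_height = cup_prob.amax(dim=3).sum(dim=2)     # (B, 1)
    disc_height = disc_prob.amax(dim=3).sum(dim=2)   # (B, 1)
    return cup_height / (disc_height + eps)
```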
Collapse
|
82
|
Houwen BBSL, Nass KJ, Vleugels JLA, Fockens P, Hazewinkel Y, Dekker E. Comprehensive review of publicly available colonoscopic imaging databases for artificial intelligence research: availability, accessibility, and usability. Gastrointest Endosc 2023; 97:184-199.e16. [PMID: 36084720 DOI: 10.1016/j.gie.2022.08.043] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/14/2022] [Revised: 08/24/2022] [Accepted: 08/30/2022] [Indexed: 01/28/2023]
Abstract
BACKGROUND AND AIMS Publicly available databases containing colonoscopic imaging data are valuable resources for artificial intelligence (AI) research. Currently, little is known regarding the available number and content of these databases. This review aimed to describe the availability, accessibility, and usability of publicly available colonoscopic imaging databases, focusing on polyp detection, polyp characterization, and quality of colonoscopy. METHODS A systematic literature search was performed in MEDLINE and Embase to identify AI studies describing publicly available colonoscopic imaging databases published after 2010. Second, a targeted search using Google's Dataset Search, Google Search, GitHub, and Figshare was performed to identify databases directly. Databases were included if they contained data about polyp detection, polyp characterization, or quality of colonoscopy. To assess the accessibility of databases, the following categories were defined: open access, open access with barriers, and regulated access. To assess the potential usability of the included databases, essential details of each database were extracted using a checklist derived from the Checklist for Artificial Intelligence in Medical Imaging. RESULTS We identified 22 databases with open access, 3 databases with open access with barriers, and 15 databases with regulated access. The 22 open-access databases contained 19,463 images and 952 videos. Nineteen of these databases focused on polyp detection, localization, and/or segmentation; 6 on polyp characterization; and 3 on quality of colonoscopy. Only half of these databases have been used by other researchers to develop, train, or benchmark their AI systems. Although technical details were in general well reported, important details such as polyp and patient demographics and the annotation process were under-reported in almost all databases. CONCLUSIONS This review provides greater insight into the public availability of colonoscopic imaging databases for AI research. Incomplete reporting of important details limits the ability of researchers to assess the usability of current databases.
Collapse
Affiliation(s)
- Britt B S L Houwen
- Department of Gastroenterology and Hepatology, Amsterdam Gastroenterology Endocrinology Metabolism, Amsterdam University Medical Centres, location Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands
| | - Karlijn J Nass
- Department of Gastroenterology and Hepatology, Amsterdam Gastroenterology Endocrinology Metabolism, Amsterdam University Medical Centres, location Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands
| | - Jasper L A Vleugels
- Department of Gastroenterology and Hepatology, Amsterdam Gastroenterology Endocrinology Metabolism, Amsterdam University Medical Centres, location Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands
| | - Paul Fockens
- Department of Gastroenterology and Hepatology, Amsterdam Gastroenterology Endocrinology Metabolism, Amsterdam University Medical Centres, location Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands
| | - Yark Hazewinkel
- Department of Gastroenterology and Hepatology, Radboud University Nijmegen Medical Center, Radboud University of Nijmegen, Nijmegen, the Netherlands
| | - Evelien Dekker
- Department of Gastroenterology and Hepatology, Amsterdam Gastroenterology Endocrinology Metabolism, Amsterdam University Medical Centres, location Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands
| |
Collapse
|
83
|
Big Data in Gastroenterology Research. Int J Mol Sci 2023; 24:ijms24032458. [PMID: 36768780 PMCID: PMC9916510 DOI: 10.3390/ijms24032458] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/30/2022] [Revised: 01/18/2023] [Accepted: 01/20/2023] [Indexed: 01/28/2023] Open
Abstract
Studying individual data types in isolation provides only limited and incomplete answers to complex biological questions and particularly falls short in revealing sufficient mechanistic and kinetic details. In contrast, multi-omics approaches to studying health and disease permit the generation and integration of multiple data types on a much larger scale, offering a comprehensive picture of biological and disease processes. Gastroenterology and hepatobiliary research are particularly well-suited to such analyses, given the unique position of the luminal gastrointestinal (GI) tract at the nexus between the gut (mucosa and luminal contents), brain, immune and endocrine systems, and GI microbiome. The generation of 'big data' from multi-omic, multi-site studies can enhance investigations into the connections between these organ systems and organisms and more broadly and accurately appraise the effects of dietary, pharmacological, and other therapeutic interventions. In this review, we describe a variety of useful omics approaches and how they can be integrated to provide a holistic depiction of the human and microbial genetic and proteomic changes underlying physiological and pathophysiological phenomena. We highlight the potential pitfalls and alternatives to help avoid the common errors in study design, execution, and analysis. We focus on the application, integration, and analysis of big data in gastroenterology and hepatobiliary research.
Collapse
|
84
|
Nachmani R, Nidal I, Robinson D, Yassin M, Abookasis D. Segmentation of polyps based on pyramid vision transformers and residual block for real-time endoscopy imaging. J Pathol Inform 2023; 14:100197. [PMID: 36844703 PMCID: PMC9945716 DOI: 10.1016/j.jpi.2023.100197] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2022] [Revised: 01/22/2023] [Accepted: 01/22/2023] [Indexed: 01/27/2023] Open
Abstract
Polyp segmentation is an important task in the early identification of colon polyps for the prevention of colorectal cancer. Numerous machine learning methods have been utilized in attempts to solve this task, with varying levels of success. A successful polyp segmentation method that is both accurate and fast could make a huge impact on colonoscopy exams, aiding in real-time detection as well as enabling faster and cheaper offline analysis. Thus, recent studies have worked to produce networks that are more accurate and faster than the previous generation of networks (e.g., NanoNet). Here, we propose the ResPVT architecture for polyp segmentation. This platform uses transformers as a backbone and far surpasses all previous networks, not only in accuracy but also in frame rate, which may drastically reduce costs in both real-time and offline analysis and enable the widespread application of this technology.
Collapse
Affiliation(s)
- Roi Nachmani
- Department of Electrical and Electronics Engineering, Ariel University, Ariel 407000, Israel
| | - Issa Nidal
- Department of Surgery, Hasharon Hospital, Rabin Medical Center, affiliated with Tel Aviv, University School of Medicine, Petah Tikva, Israel
| | - Dror Robinson
- Department of Orthopedics, Hasharon Hospital, Rabin Medical Center, affiliated with Tel Aviv, University School of Medicine, Petah Tikva, Israel
| | - Mustafa Yassin
- Department of Orthopedics, Hasharon Hospital, Rabin Medical Center, affiliated with Tel Aviv, University School of Medicine, Petah Tikva, Israel
| | - David Abookasis
- Department of Electrical and Electronics Engineering, Ariel University, Ariel 407000, Israel
- Ariel Photonics Center, Ariel University, Ariel 407000, Israel
- Corresponding author.
| |
Collapse
|
85
|
Krenzer A, Banck M, Makowski K, Hekalo A, Fitting D, Troya J, Sudarevic B, Zoller WG, Hann A, Puppe F. A Real-Time Polyp-Detection System with Clinical Application in Colonoscopy Using Deep Convolutional Neural Networks. J Imaging 2023; 9:jimaging9020026. [PMID: 36826945 PMCID: PMC9967208 DOI: 10.3390/jimaging9020026] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/19/2022] [Revised: 01/18/2023] [Accepted: 01/19/2023] [Indexed: 01/26/2023] Open
Abstract
Colorectal cancer (CRC) is a leading cause of cancer-related deaths worldwide. The best method to prevent CRC is a colonoscopy. During this procedure, the gastroenterologist searches for polyps. However, there is a potential risk of polyps being missed by the gastroenterologist. Automated detection of polyps helps to assist the gastroenterologist during a colonoscopy. There are already publications examining the problem of polyp detection in the literature. Nevertheless, most of these systems are only used in the research context and are not implemented for clinical application. Therefore, we introduce the first fully open-source automated polyp-detection system scoring best on current benchmark data, and implement it ready for clinical application. To create the polyp-detection system (ENDOMIND-Advanced), we combined our own data collected from different hospitals and practices in Germany with open-source datasets to create a dataset with over 500,000 annotated images. ENDOMIND-Advanced leverages a post-processing technique based on video detection to work in real time with a stream of images. It is integrated into a prototype ready for application in clinical interventions. We achieve better performance compared to the best system in the literature and score an F1-score of 90.24% on the open-source CVC-VideoClinicDB benchmark.
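Video-based post-processing of the kind mentioned above can be illustrated with a hedged sketch: a detection is only reported once it has been confirmed in at least k of the last n frames, which suppresses single-frame false positives. The rule, thresholds, and class below are illustrative, not ENDOMIND-Advanced's actual algorithm.

```python
from collections import deque

class TemporalFilter:
    def __init__(self, window=5, min_hits=3, iou_thresh=0.3):
        self.history = deque(maxlen=window)   # boxes from the last `window` frames
        self.min_hits = min_hits
        self.iou_thresh = iou_thresh

    @staticmethod
    def iou(a, b):
        ax1, ay1, ax2, ay2 = a
        bx1, by1, bx2, by2 = b
        ix1, iy1 = max(ax1, bx1), max(ay1, by1)
        ix2, iy2 = min(ax2, bx2), min(ay2, by2)
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
        return inter / union if union > 0 else 0.0

    def update(self, boxes):
        """boxes: list of (x1, y1, x2, y2) for the current frame; returns confirmed boxes."""
        self.history.append(boxes)
        confirmed = []
        for box in boxes:
            hits = sum(any(self.iou(box, past) >= self.iou_thresh for past in frame)
                       for frame in self.history)
            if hits >= self.min_hits:
                confirmed.append(box)
        return confirmed
```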
Collapse
Affiliation(s)
- Adrian Krenzer
- Department of Artificial Intelligence and Knowledge Systems, Julius-Maximilians University of Würzburg, Sanderring 2, 97070 Würzburg, Germany
- Interventional and Experimental Endoscopy (InExEn), Department of Internal Medicine II, University Hospital Würzburg, Oberdürrbacher Straße 6, 97080 Würzburg, Germany
| | - Michael Banck
- Department of Artificial Intelligence and Knowledge Systems, Julius-Maximilians University of Würzburg, Sanderring 2, 97070 Würzburg, Germany
- Interventional and Experimental Endoscopy (InExEn), Department of Internal Medicine II, University Hospital Würzburg, Oberdürrbacher Straße 6, 97080 Würzburg, Germany
| | - Kevin Makowski
- Department of Artificial Intelligence and Knowledge Systems, Julius-Maximilians University of Würzburg, Sanderring 2, 97070 Würzburg, Germany
| | - Amar Hekalo
- Department of Artificial Intelligence and Knowledge Systems, Julius-Maximilians University of Würzburg, Sanderring 2, 97070 Würzburg, Germany
| | - Daniel Fitting
- Interventional and Experimental Endoscopy (InExEn), Department of Internal Medicine II, University Hospital Würzburg, Oberdürrbacher Straße 6, 97080 Würzburg, Germany
| | - Joel Troya
- Interventional and Experimental Endoscopy (InExEn), Department of Internal Medicine II, University Hospital Würzburg, Oberdürrbacher Straße 6, 97080 Würzburg, Germany
| | - Boban Sudarevic
- Interventional and Experimental Endoscopy (InExEn), Department of Internal Medicine II, University Hospital Würzburg, Oberdürrbacher Straße 6, 97080 Würzburg, Germany
- Department of Internal Medicine and Gastroenterology, Katharinenhospital, Kriegsbergstrasse 60, 70174 Stuttgart, Germany
| | - Wolfgang G Zoller
- Interventional and Experimental Endoscopy (InExEn), Department of Internal Medicine II, University Hospital Würzburg, Oberdürrbacher Straße 6, 97080 Würzburg, Germany
- Department of Internal Medicine and Gastroenterology, Katharinenhospital, Kriegsbergstrasse 60, 70174 Stuttgart, Germany
| | - Alexander Hann
- Interventional and Experimental Endoscopy (InExEn), Department of Internal Medicine II, University Hospital Würzburg, Oberdürrbacher Straße 6, 97080 Würzburg, Germany
| | - Frank Puppe
- Department of Artificial Intelligence and Knowledge Systems, Julius-Maximilians University of Würzburg, Sanderring 2, 97070 Würzburg, Germany
| |
Collapse
|
86
|
Lewis J, Cha YJ, Kim J. Dual encoder-decoder-based deep polyp segmentation network for colonoscopy images. Sci Rep 2023; 13:1183. [PMID: 36681776 PMCID: PMC9867760 DOI: 10.1038/s41598-023-28530-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2022] [Accepted: 01/19/2023] [Indexed: 01/22/2023] Open
Abstract
Detection of colorectal polyps through colonoscopy is an essential practice in the prevention of colorectal cancers. However, the method itself is labor intensive and is subject to human error. With the advent of deep learning-based methodologies, and specifically convolutional neural networks, an opportunity to improve the prognosis of potential patients suffering from colorectal cancer has appeared through automated detection and segmentation of polyps. Polyp segmentation is subject to a number of problems such as model overfitting and generalization, poor definition of boundary pixels, as well as the model's ability to capture the practical range in textures, sizes, and colors. In an effort to address these challenges, we propose a dual encoder-decoder solution named Polyp Segmentation Network (PSNet). Both the dual encoder and decoder were developed by the comprehensive combination of a variety of deep learning modules, including the PS encoder, transformer encoder, PS decoder, enhanced dilated transformer decoder, partial decoder, and merge module. PSNet outperforms state-of-the-art results through an extensive comparative study against 5 existing polyp datasets with respect to both mDice and mIoU at 0.863 and 0.797, respectively. With our new modified polyp dataset we obtain an mDice and mIoU of 0.941 and 0.897, respectively.
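The mDice and mIoU figures reported above are per-image overlap scores averaged over a dataset; a minimal sketch of that evaluation, under common conventions (0.5 threshold, small smoothing constant) rather than the paper's exact protocol, is:

```python
import numpy as np

def dice_and_iou(pred, gt, thresh=0.5, eps=1e-7):
    """pred: probability map; gt: binary ground-truth mask (same shape)."""
    p = (pred >= thresh).astype(np.float64)
    g = (gt > 0.5).astype(np.float64)
    inter = (p * g).sum()
    dice = (2 * inter + eps) / (p.sum() + g.sum() + eps)
    iou = (inter + eps) / (p.sum() + g.sum() - inter + eps)
    return dice, iou

# Mean over a dataset (pred_maps and gt_masks are illustrative names):
# scores = [dice_and_iou(p, g) for p, g in zip(pred_maps, gt_masks)]
# m_dice, m_iou = np.mean(scores, axis=0)
```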
Collapse
Affiliation(s)
- John Lewis
- Department of Civil Engineering, University of Manitoba, Winnipeg, R3M 0N2, Canada
| | - Young-Jin Cha
- Department of Civil Engineering, University of Manitoba, Winnipeg, R3M 0N2, Canada.
| | - Jongho Kim
- Department of Radiology, Max Rady College of Medicine, University of Manitoba, Winnipeg, R3A 1R9, Canada
| |
Collapse
|
87
|
ELKarazle K, Raman V, Then P, Chua C. Detection of Colorectal Polyps from Colonoscopy Using Machine Learning: A Survey on Modern Techniques. SENSORS (BASEL, SWITZERLAND) 2023; 23:1225. [PMID: 36772263 PMCID: PMC9953705 DOI: 10.3390/s23031225] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 12/27/2022] [Revised: 01/08/2023] [Accepted: 01/17/2023] [Indexed: 06/18/2023]
Abstract
Given the increased interest in utilizing artificial intelligence as an assistive tool in the medical sector, colorectal polyp detection and classification using deep learning techniques has been an active area of research in recent years. The motivation for researching this topic is that physicians miss polyps from time to time due to fatigue and lack of experience carrying out the procedure. Unidentified polyps can cause further complications and ultimately lead to colorectal cancer (CRC), one of the leading causes of cancer mortality. Although various techniques have been presented recently, several key issues, such as the lack of enough training data, white light reflection, and blur affect the performance of such methods. This paper presents a survey on recently proposed methods for detecting polyps from colonoscopy. The survey covers benchmark dataset analysis, evaluation metrics, common challenges, standard methods of building polyp detectors and a review of the latest work in the literature. We conclude this paper by providing a precise analysis of the gaps and trends discovered in the reviewed literature for future work.
Collapse
Affiliation(s)
- Khaled ELKarazle
- School of Information and Communication Technologies, Swinburne University of Technology, Sarawak Campus, Kuching 93350, Malaysia
| | - Valliappan Raman
- Department of Artificial Intelligence and Data Science, Coimbatore Institute of Technology, Coimbatore 641014, India
| | - Patrick Then
- School of Information and Communication Technologies, Swinburne University of Technology, Sarawak Campus, Kuching 93350, Malaysia
| | - Caslon Chua
- Department of Computer Science and Software Engineering, Swinburne University of Technology, Melbourne 3122, Australia
| |
Collapse
|
88
|
Shen T, Li X. Automatic polyp image segmentation and cancer prediction based on deep learning. Front Oncol 2023; 12:1087438. [PMID: 36713495 PMCID: PMC9878560 DOI: 10.3389/fonc.2022.1087438] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/02/2022] [Accepted: 12/22/2022] [Indexed: 01/15/2023] Open
Abstract
The similar shape and texture of colonic polyps and normal mucosal tissue lead to low accuracy of medical image segmentation algorithms. To solve these problems, we propose a polyp image segmentation algorithm based on deep learning technology, which combines a HarDNet module, an attention module, and a multi-scale coding module with the U-Net network as the basic framework, comprising an encoding stage and a decoding stage. In the encoder stage, HarDNet68 is used as the main backbone network to extract features using four null space convolutional pooling pyramids while improving inference speed and computational efficiency; the attention mechanism module is added to the encoding and decoding network so that the model can learn the global and local feature information of the polyp image and process information in both the spatial and channel dimensions, solving the problem of information loss in the encoding stage of the network and improving the performance of the segmentation network. Comparative analysis with other algorithms shows that the proposed network improves segmentation accuracy and running speed to a certain degree, which can effectively assist physicians in removing abnormal colorectal tissue, reduce the probability of polyps progressing to cancer, and improve the survival rate and quality of life of patients. It also has good generalization ability, which can provide technical support and prevention for colon cancer.
Collapse
Affiliation(s)
- Tongping Shen
- School of Information Engineering, Anhui University of Chinese Medicine, Hefei, China; Graduate School, Angeles University Foundation, Angeles, Philippines. *Correspondence: Tongping Shen
| | - Xueguang Li
- School of Computer Science and Technology, Henan Institute of Technology, Xinxiang, China
| |
Collapse
|
89
|
He X. A multi-resolution unet algorithm based on data augmentation and multi-center training for polyp automatic segmentation. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2022. [DOI: 10.3233/jifs-223340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]
Abstract
In clinical practice, segmenting polyps from colonoscopy images plays an important role in the diagnosis and treatment of colorectal cancer, since it provides valuable information. However, accurate polyp segmentation remains challenging for the following reasons: (1) the small training datasets with a limited number of samples and the lack of data variability; (2) the same type of polyps with variation in texture, size, and color; (3) the weak boundary between a polyp and its surrounding mucosa. To address these challenges, we propose a novel robust deep neural network based on data augmentation, called Robust Multi-center Multi-resolution Unet (RMMSUNet), for the polyp segmentation task. Data augmentation and multi-center training are both utilized to increase the amount and diversity of the training dataset. The new multi-resolution blocks make up for the lack of fine-grained information in U-Net and ensure the generation of more accurate pixel-level segmentation prediction maps. Region-based refinement is added as post-processing for the network output, to correct some wrongly predicted pixels and further refine the segmentation results. Quantitative and qualitative evaluations on the challenging polyp dataset show that our RMMSUNet improves the segmentation accuracy significantly when compared to other SOTA algorithms.
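Region-based refinement of a prediction map is typically a connected-component clean-up; a hedged sketch is below, where small isolated components are discarded. The threshold and minimum area are illustrative assumptions, not the RMMSUNet settings.

```python
import numpy as np
from scipy import ndimage

def refine_prediction(prob_map, thresh=0.5, min_area=100):
    """prob_map: (H, W) probabilities; returns a cleaned binary mask."""
    mask = prob_map >= thresh
    labeled, num = ndimage.label(mask)            # label connected components
    refined = np.zeros_like(mask)
    for idx in range(1, num + 1):
        component = labeled == idx
        if component.sum() >= min_area:           # keep only sufficiently large regions
            refined |= component
    return refined
```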
Collapse
Affiliation(s)
- Xiaoxu He
- School of Guangdong & Taiwan Artificial Intelligence, Foshan University, Foshan, China
| |
Collapse
|
90
|
APT-Net: Adaptive encoding and parallel decoding transformer for medical image segmentation. Comput Biol Med 2022; 151:106292. [PMID: 36399856 DOI: 10.1016/j.compbiomed.2022.106292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/28/2022] [Revised: 10/30/2022] [Accepted: 11/06/2022] [Indexed: 11/13/2022]
Abstract
There are limitations in current transformer-based medical image segmentation networks regarding token position encoding and image decoding: the position encoding module cannot encode positional information adequately, and the serial decoder cannot utilize contextual information efficiently. In this paper, we propose APT-Net, a new CNN-transformer hybrid medical image segmentation network based on the encoder-decoder architecture. The network introduces an adaptive position encoding module that fuses position information from multiple receptive fields to provide more adequate position information for the token sequences in the transformer. In addition, the basic and guide information paths of the dual-path parallel decoder simultaneously process multiscale feature maps to efficiently utilize contextual information. We conducted extensive experiments and report a number of important metrics from multiple perspectives on seven datasets containing skin lesions, polyps, and glands. The IoU reached 0.783 and 0.851 on the ISIC2017 and GlaS datasets, respectively. To the best of our knowledge, APT-Net achieves state-of-the-art performance on the GlaS dataset and polyp segmentation tasks. Ablation experiments validate the effectiveness of the proposed adaptive position encoding module and the dual-path parallel decoder. Comparative experiments with state-of-the-art methods demonstrate the high accuracy and portability of APT-Net.
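A multi-receptive-field positional encoding can be sketched, under assumptions, as depthwise convolutions of several kernel sizes applied to the token map and summed, so each token's position signal mixes neighbourhoods of different sizes. This follows the spirit of conditional position encodings and is not APT-Net's exact module; the kernel sizes are assumptions.

```python
import torch
import torch.nn as nn

class MultiRFPosEncoding(nn.Module):
    def __init__(self, dim, kernel_sizes=(3, 5, 7)):
        super().__init__()
        # One depthwise convolution per receptive-field size
        self.branches = nn.ModuleList([
            nn.Conv2d(dim, dim, k, padding=k // 2, groups=dim)
            for k in kernel_sizes
        ])

    def forward(self, tokens, h, w):
        """tokens: (B, N, C) with N == h * w."""
        b, n, c = tokens.shape
        x = tokens.transpose(1, 2).view(b, c, h, w)
        pos = sum(branch(x) for branch in self.branches)
        return tokens + pos.flatten(2).transpose(1, 2)
```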
Collapse
|
91
|
Wu C, Long C, Li S, Yang J, Jiang F, Zhou R. MSRAformer: Multiscale spatial reverse attention network for polyp segmentation. Comput Biol Med 2022; 151:106274. [PMID: 36375412 DOI: 10.1016/j.compbiomed.2022.106274] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2022] [Revised: 10/10/2022] [Accepted: 10/30/2022] [Indexed: 11/11/2022]
Abstract
Colon polyps are an important reference basis in the diagnosis of colorectal cancer (CRC). In routine diagnosis, the polyp area is segmented from the colorectal endoscopy image, and the obtained pathological information is used to assist in the diagnosis of the disease and in surgery. Accurate segmentation of polyps in colonoscopy images remains a challenging task: there are great differences in the shape, size, color, and texture of the same type of polyps, and it is difficult to distinguish the polyp region from the mucosal boundary. In recent years, convolutional neural networks (CNNs) have achieved good results in medical image segmentation tasks. However, CNNs focus on the extraction of local features and lack the ability to extract global feature information. This paper presents a Multiscale Spatial Reverse Attention Network called MSRAformer with high performance in medical segmentation, which adopts a Swin Transformer encoder with a pyramid structure to extract the features of four different stages and extracts multi-scale feature information through a multi-scale channel attention module, which enhances the global feature extraction ability and generalization of the network and preliminarily aggregates a pre-segmentation result. This paper proposes a spatial reverse attention mechanism module to gradually supplement the edge structure and detail information of the polyp region. Extensive experiments show that MSRAformer segments colonoscopy polyp datasets better than most state-of-the-art (SOTA) medical image segmentation methods, with better generalization performance. A reference implementation of MSRAformer is available at https://github.com/ChengLong1222/MSRAformer-main.
Collapse
Affiliation(s)
- Cong Wu
- School of computer science, Hubei University of Technology, Wuhan, China.
| | - Cheng Long
- School of computer science, Hubei University of Technology, Wuhan, China.
| | - Shijun Li
- School of computer science, Hubei University of Technology, Wuhan, China
| | - Junjie Yang
- Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Fagang Jiang
- Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Ran Zhou
- School of computer science, Hubei University of Technology, Wuhan, China
| |
Collapse
|
92
|
Sun Q, Dai M, Lan Z, Cai F, Wei L, Yang C, Chen R. UCR-Net: U-shaped context residual network for medical image segmentation. Comput Biol Med 2022; 151:106203. [PMID: 36306581 DOI: 10.1016/j.compbiomed.2022.106203] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2022] [Revised: 09/04/2022] [Accepted: 10/09/2022] [Indexed: 12/27/2022]
Abstract
Medical image segmentation, a prerequisite for numerous clinical needs, is a critical step in biomedical image analysis. The U-Net framework is one of the most popular deep networks in this field. However, U-Net's successive pooling and downsampling operations result in some loss of spatial information. In this paper, we propose a U-shaped context residual network, called UCR-Net, to capture more context and high-level information for medical image segmentation. The proposed UCR-Net is an encoder-decoder framework comprising a feature encoder module and a feature decoder module. The feature decoder module contains four newly proposed context attention exploration (CAE) modules, a newly proposed global and spatial attention (GSA) module, and four decoder blocks. We use the proposed CAE modules to capture more multi-scale context features from the encoder. The proposed GSA module further explores global context features and semantically enhanced deep-level features. The proposed UCR-Net can recover more high-level semantic features and fuse context attention information from the CAE modules with global and spatial attention information from the GSA module. Experiments on the retinal vessel, femoropopliteal artery stent, and polyp datasets demonstrate that the proposed UCR-Net performs favorably against the original U-Net and other advanced methods.
Collapse
Affiliation(s)
- Qi Sun
- Digital Fujian Research Institute of Big Data for Agriculture and Forestry, College of Computer and Information Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
| | - Mengyun Dai
- Digital Fujian Research Institute of Big Data for Agriculture and Forestry, College of Computer and Information Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
| | - Ziyang Lan
- Digital Fujian Research Institute of Big Data for Agriculture and Forestry, College of Computer and Information Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
| | - Fanggang Cai
- Department of Vascular Surgery, The First Affiliated Hospital, Fujian Medical University, Fuzhou 350108, China.
| | - Lifang Wei
- Digital Fujian Research Institute of Big Data for Agriculture and Forestry, College of Computer and Information Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
| | - Changcai Yang
- Digital Fujian Research Institute of Big Data for Agriculture and Forestry, College of Computer and Information Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
| | - Riqing Chen
- Digital Fujian Research Institute of Big Data for Agriculture and Forestry, College of Computer and Information Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
| |
Collapse
|
93
|
Zhou T, Zhou Y, Gong C, Yang J, Zhang Y. Feature Aggregation and Propagation Network for Camouflaged Object Detection. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2022; 31:7036-7047. [PMID: 36331642 DOI: 10.1109/tip.2022.3217695] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Camouflaged object detection (COD) aims to detect/segment camouflaged objects embedded in the environment, which has attracted increasing attention over the past decades. Although several COD methods have been developed, they still suffer from unsatisfactory performance due to the intrinsic similarities between the foreground objects and background surroundings. In this paper, we propose a novel Feature Aggregation and Propagation Network (FAP-Net) for camouflaged object detection. Specifically, we propose a Boundary Guidance Module (BGM) to explicitly model the boundary characteristic, which can provide boundary-enhanced features to boost the COD performance. To capture the scale variations of the camouflaged objects, we propose a Multi-scale Feature Aggregation Module (MFAM) to characterize the multi-scale information from each layer and obtain the aggregated feature representations. Furthermore, we propose a Cross-level Fusion and Propagation Module (CFPM). In the CFPM, the feature fusion part can effectively integrate the features from adjacent layers to exploit the cross-level correlations, and the feature propagation part can transmit valuable context information from the encoder to the decoder network via a gate unit. Finally, we formulate a unified and end-to-end trainable framework where cross-level features can be effectively fused and propagated for capturing rich context information. Extensive experiments on three benchmark camouflaged datasets demonstrate that our FAP-Net outperforms other state-of-the-art COD models. Moreover, our model can be extended to the polyp segmentation task, and the comparison results further validate the effectiveness of the proposed model in segmenting polyps. The source code and results will be released at https://github.com/taozh2017/FAPNet.
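For readers unfamiliar with gated cross-level fusion of the kind the CFPM describes, the following minimal sketch (one plausible form of such a gate, assumed for illustration rather than taken from the FAP-Net code) passes an encoder feature to the decoder through a learned gate:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedCrossLevelFusion(nn.Module):
    """Sketch: fuse an encoder feature with a decoder feature through a
    learned gate, so only useful context is propagated across levels."""
    def __init__(self, enc_channels: int, dec_channels: int):
        super().__init__()
        self.align = nn.Conv2d(enc_channels, dec_channels, 1)
        self.gate = nn.Sequential(
            nn.Conv2d(dec_channels * 2, dec_channels, 3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, enc_feat: torch.Tensor, dec_feat: torch.Tensor) -> torch.Tensor:
        enc_feat = self.align(enc_feat)
        enc_feat = F.interpolate(enc_feat, size=dec_feat.shape[2:],
                                 mode="bilinear", align_corners=False)
        g = self.gate(torch.cat([enc_feat, dec_feat], dim=1))
        return dec_feat + g * enc_feat  # gated skip connection
```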
Collapse
|
94
|
Parkash O, Siddiqui ATS, Jiwani U, Rind F, Padhani ZA, Rizvi A, Hoodbhoy Z, Das JK. Diagnostic accuracy of artificial intelligence for detecting gastrointestinal luminal pathologies: A systematic review and meta-analysis. Front Med (Lausanne) 2022; 9:1018937. [PMID: 36405592 PMCID: PMC9672666 DOI: 10.3389/fmed.2022.1018937] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/15/2022] [Accepted: 10/03/2022] [Indexed: 11/06/2022] Open
Abstract
Background Artificial Intelligence (AI) holds considerable promise for diagnostics in the field of gastroenterology. This systematic review and meta-analysis aims to assess the diagnostic accuracy of AI models compared with the gold standard of experts and histopathology for the diagnosis of various gastrointestinal (GI) luminal pathologies, including polyps, neoplasms, and inflammatory bowel disease. Methods We searched the PubMed, CINAHL, Wiley Cochrane Library, and Web of Science electronic databases to identify studies assessing the diagnostic performance of AI models for GI luminal pathologies. We extracted binary diagnostic accuracy data and constructed contingency tables to derive the outcomes of interest: sensitivity and specificity. We performed a meta-analysis and constructed hierarchical summary receiver operating characteristic (HSROC) curves. The risk of bias was assessed using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool. Subgroup analyses were conducted based on the type of GI luminal disease, AI model, reference standard, and type of data used for analysis. This study is registered with PROSPERO (CRD42021288360). Findings We included 73 studies, of which 31 were externally validated and provided sufficient information for inclusion in the meta-analysis. The overall sensitivity of AI for detecting GI luminal pathologies was 91.9% (95% CI: 89.0–94.1) and specificity was 91.7% (95% CI: 87.4–94.7). Deep learning models (sensitivity: 89.8%, specificity: 91.9%) and ensemble methods (sensitivity: 95.4%, specificity: 90.9%) were the most commonly used models in the included studies. The majority of studies (n = 56, 76.7%) had a high risk of selection bias, while 74% (n = 54) were at low risk for the reference standard and 67% (n = 49) were at low risk for flow and timing bias. Interpretation The review suggests high sensitivity and specificity of AI models for the detection of GI luminal pathologies. There is a need for large, multi-center trials in both high-income and low- and middle-income countries to assess the performance of these AI models in real clinical settings and their impact on diagnosis and prognosis. Systematic review registration [https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=288360], identifier [CRD42021288360].
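For context, the per-study quantities pooled in such a meta-analysis are derived from 2x2 contingency tables; a small illustrative Python snippet (with made-up counts, shown only to clarify the sensitivity/specificity definitions used above) is:

```python
def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int):
    """Per-study sensitivity and specificity from a 2x2 contingency table,
    the quantities that a diagnostic-accuracy meta-analysis pools."""
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Example with hypothetical counts:
sens, spec = diagnostic_accuracy(tp=90, fp=8, fn=10, tn=92)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")  # 0.900, 0.920
```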
Collapse
Affiliation(s)
- Om Parkash
- Department of Medicine, Aga Khan University, Karachi, Pakistan
| | | | - Uswa Jiwani
- Center of Excellence in Women and Child Health, Aga Khan University, Karachi, Pakistan
| | - Fahad Rind
- Head and Neck Oncology, The Ohio State University, Columbus, OH, United States
| | - Zahra Ali Padhani
- Institute for Global Health and Development, Aga Khan University, Karachi, Pakistan
| | - Arjumand Rizvi
- Center of Excellence in Women and Child Health, Aga Khan University, Karachi, Pakistan
| | - Zahra Hoodbhoy
- Department of Pediatrics and Child Health, Aga Khan University, Karachi, Pakistan
| | - Jai K. Das
- Institute for Global Health and Development, Aga Khan University, Karachi, Pakistan
- Department of Pediatrics and Child Health, Aga Khan University, Karachi, Pakistan
- Correspondence: Jai K. Das
| |
Collapse
|
95
|
Zhang W, Fu C, Zheng Y, Zhang F, Zhao Y, Sham CW. HSNet: A hybrid semantic network for polyp segmentation. Comput Biol Med 2022; 150:106173. [PMID: 36257278 DOI: 10.1016/j.compbiomed.2022.106173] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2022] [Revised: 09/18/2022] [Accepted: 10/01/2022] [Indexed: 11/29/2022]
Abstract
Automatic polyp segmentation can help physicians effectively locate polyps (i.e., regions of interest) in clinical practice by screening colonoscopy images with the assistance of neural networks (NN). However, two significant bottlenecks hinder its effectiveness and fall short of physicians' expectations. (1) Polyps vary in scale, orientation, and illumination, which makes accurate segmentation difficult. (2) Current works built on the dominant encoder-decoder network tend to overlook appearance details (e.g., textures) of tiny polyps, degrading the accuracy of polyp differentiation. To alleviate these bottlenecks, we investigate a hybrid semantic network (HSNet) that adopts the advantages of both Transformer and convolutional neural networks (CNN), aiming at improving polyp segmentation. Our HSNet contains a cross-semantic attention module (CSA), a hybrid semantic complementary module (HSC), and a multi-scale prediction module (MSP). Unlike previous works on segmenting polyps, we newly insert the CSA module, which can fill the gap between low-level and high-level features via an interactive mechanism that exchanges two types of semantics from different NN attentions. Through a dual-branch structure of Transformer and CNN, we newly design an HSC module to capture both long-range dependencies and local details of appearance. Besides, the MSP module learns weights for fusing the stage-level prediction masks of the decoder. Experimentally, we compared our work with 10 state-of-the-art methods, including both recent and classical works, showing improved accuracy (across 7 evaluation metrics) over 5 benchmark datasets, e.g., it achieves 0.926/0.877 mDic/mIoU on Kvasir-SEG, 0.948/0.905 mDic/mIoU on ClinicDB, 0.810/0.735 mDic/mIoU on ColonDB, 0.808/0.74 mDic/mIoU on ETIS, and 0.903/0.839 mDic/mIoU on Endoscene. The proposed model is available at (https://github.com/baiboat/HSNet).
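A hedged sketch of how a CNN branch and a Transformer branch might exchange attention at matching resolution (illustrative only; the actual CSA/HSC modules are defined in the cited paper and repository) is:

```python
import torch
import torch.nn as nn

class CrossBranchFusion(nn.Module):
    """Sketch of exchanging information between a CNN branch (local detail)
    and a Transformer branch (global context) at the same resolution."""
    def __init__(self, channels: int):
        super().__init__()
        self.cnn_gate = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.trans_gate = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.fuse = nn.Conv2d(channels * 2, channels, 3, padding=1)

    def forward(self, cnn_feat: torch.Tensor, trans_feat: torch.Tensor) -> torch.Tensor:
        # Each branch is modulated by an attention map derived from the other.
        cnn_enh = cnn_feat * self.trans_gate(trans_feat)
        trans_enh = trans_feat * self.cnn_gate(cnn_feat)
        return self.fuse(torch.cat([cnn_enh, trans_enh], dim=1))
```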
Collapse
Affiliation(s)
- Wenchao Zhang
- School of Computer Science and Engineering, Northeastern University, Shenyang 110819, China.
| | - Chong Fu
- School of Computer Science and Engineering, Northeastern University, Shenyang 110819, China; Engineering Research Center of Security Technology of Complex Network System, Ministry of Education, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang 110819, China.
| | - Yu Zheng
- Department of Information Engineering, The Chinese University of Hong Kong, Sha Tin, Hong Kong Special Administrative Region.
| | - Fangyuan Zhang
- Department of General Surgery, Shengjing Hospital of China Medical University, Shenyang, China.
| | - Yanli Zhao
- School of Electrical Information Engineering, Ningxia Institute of Science and Technology, Shizuishan, 753000, China.
| | - Chiu-Wing Sham
- School of Computer Science, The University of Auckland, New Zealand.
| |
Collapse
|
96
|
Cui R, Yang R, Liu F, Cai C. N-Net: Lesion region segmentations using the generalized hybrid dilated convolutions for polyps in colonoscopy images. Front Bioeng Biotechnol 2022; 10:963590. [DOI: 10.3389/fbioe.2022.963590] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2022] [Accepted: 08/12/2022] [Indexed: 11/13/2022] Open
Abstract
Colorectal cancer has the second highest incidence rate among females and the third highest among males. Colorectal polyps are potential prognostic indicators of colorectal cancer, and colonoscopy is the gold standard for the biopsy and removal of colorectal polyps. In this scenario, one of the main concerns is to ensure the accuracy of lesion region identification. However, the miss rate of polyps under manual observation in colonoscopy can reach 14%–30%. In this paper, we focus on the identification of polyps in clinical colonoscopy images and propose a new N-shaped deep neural network (N-Net) structure for lesion region segmentation. The encoder-decoder framework is adopted in the N-Net structure, and DenseNet modules are implemented in the encoding path of the network. Moreover, we propose the generalized hybrid dilated convolution (GHDC), which enables flexible dilation rates and convolutional kernel sizes, to facilitate the transmission of multi-scale information with expanded receptive fields. Based on this GHDC design strategy, we design four GHDC blocks to connect the encoding and decoding paths. Experiments on two publicly available colonoscopy polyp segmentation datasets, Kvasir-SEG and CVC-ClinicDB, verify the rationality and superiority of the proposed GHDC blocks and the proposed N-Net. In comparative studies with state-of-the-art methods such as TransU-Net, DeepLabV3+ and CA-Net, we show that, even with a small number of network parameters, N-Net achieves a Dice of 94.45%, an average symmetric surface distance (ASSD) of 0.38 pix and a mean intersection-over-union (mIoU) of 89.80% on the Kvasir-SEG dataset, and a Dice of 97.03%, an ASSD of 0.16 pix and an mIoU of 94.35% on the CVC-ClinicDB dataset.
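A minimal sketch of a hybrid dilated convolution block with configurable kernel sizes and dilation rates, in the spirit of (but not identical to) the GHDC described above, might be:

```python
import torch
import torch.nn as nn

class HybridDilatedBlock(nn.Module):
    """Sketch of a hybrid dilated convolution block: parallel branches with
    different kernel sizes and dilation rates widen the receptive field at
    constant resolution, and a 1x1 convolution fuses their outputs."""
    def __init__(self, channels: int, kernels=(3, 3, 5), dilations=(1, 2, 3)):
        super().__init__()
        self.branches = nn.ModuleList()
        for k, d in zip(kernels, dilations):
            pad = d * (k - 1) // 2  # keeps the spatial size unchanged
            self.branches.append(nn.Sequential(
                nn.Conv2d(channels, channels, k, padding=pad, dilation=d),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            ))
        self.fuse = nn.Conv2d(channels * len(kernels), channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))
```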
Collapse
|
97
|
MHA-Net: A Multibranch Hybrid Attention Network for Medical Image Segmentation. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2022; 2022:8375981. [PMID: 36245836 PMCID: PMC9560845 DOI: 10.1155/2022/8375981] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/15/2022] [Revised: 09/11/2022] [Accepted: 09/13/2022] [Indexed: 12/02/2022]
Abstract
The robust segmentation of organs from medical images is a key technique in medical image analysis for disease diagnosis. U-Net is a robust structure for medical image segmentation. However, U-Net adopts consecutive downsampling encoders to capture multiscale features, resulting in the loss of contextual information and insufficient recovery of high-level semantic features. In this paper, we present a new multibranch hybrid attention network (MHA-Net) to capture more contextual information and high-level semantic features. The main idea of our proposed MHA-Net is to use a multibranch hybrid attention feature decoder to recover more high-level semantic features. A lightweight pyramid split attention (PSA) module is used to connect the encoder and decoder subnetworks to obtain a richer multiscale feature map. We compare the proposed MHA-Net with state-of-the-art approaches on the DRIVE dataset, the fluoroscopic roentgenographic stereophotogrammetric analysis X-ray dataset, and the polyp dataset. The experimental results on different modal images reveal that our proposed MHA-Net provides better segmentation results than other segmentation approaches.
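As a rough sketch of a pyramid-split-attention style block (kernel sizes and the per-group channel attention are assumptions; the exact PSA used in MHA-Net may differ), one could write:

```python
import torch
import torch.nn as nn

class PyramidSplitAttention(nn.Module):
    """Sketch of a pyramid-split-attention style block: channels are split
    into groups, each group is convolved with a different kernel size, and
    per-group channel attention re-weights the groups before re-concatenation."""
    def __init__(self, channels: int, kernels=(3, 5, 7, 9)):
        super().__init__()
        assert channels % len(kernels) == 0
        self.group = channels // len(kernels)
        self.convs = nn.ModuleList(
            nn.Conv2d(self.group, self.group, k, padding=k // 2) for k in kernels
        )
        self.se = nn.ModuleList(
            nn.Sequential(nn.AdaptiveAvgPool2d(1),
                          nn.Conv2d(self.group, self.group, 1),
                          nn.Sigmoid())
            for _ in kernels
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        chunks = torch.chunk(x, len(self.convs), dim=1)
        outs = [conv(c) for conv, c in zip(self.convs, chunks)]
        outs = [o * se(o) for o, se in zip(outs, self.se)]
        return torch.cat(outs, dim=1)
```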
Collapse
|
98
|
Yang H, Chen Q, Fu K, Zhu L, Jin L, Qiu B, Ren Q, Du H, Lu Y. Boosting medical image segmentation via conditional-synergistic convolution and lesion decoupling. Comput Med Imaging Graph 2022; 101:102110. [PMID: 36057184 DOI: 10.1016/j.compmedimag.2022.102110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2022] [Revised: 06/09/2022] [Accepted: 07/28/2022] [Indexed: 01/27/2023]
Abstract
Medical image segmentation is a critical step in pathology assessment and monitoring. Many existing methods utilize deep convolutional neural networks for various medical segmentation tasks, such as polyp segmentation and skin lesion segmentation. However, due to the inherent difficulty of medical images and tremendous data variations, they usually perform poorly in some intractable cases. In this paper, we propose an input-specific network called the conditional-synergistic convolution and lesion decoupling network (CCLDNet) to address these issues. First, in contrast to existing CNN-based methods with stationary convolutions, we propose the conditional synergistic convolution (CSConv), which generates a specialist convolution kernel for each lesion. CSConv has dynamic modeling ability and can be leveraged as a basic block to construct other networks across a broad range of vision tasks. Second, we devise a lesion decoupling strategy (LDS) that decouples the original lesion segmentation map into two soft labels, i.e., a lesion center label and a lesion boundary label, to reduce the segmentation difficulty. Besides, we use a transformer network as the backbone, further erasing the fixed structure of the standard CNN and empowering the dynamic modeling capability of the whole framework. Our CCLDNet outperforms state-of-the-art approaches by a large margin on a variety of benchmarks, including polyp segmentation (89.22% Dice score on EndoScene) and skin lesion segmentation (91.15% Dice score on ISIC2018). Our code is available at https://github.com/QianChen98/CCLD-Net.
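A small illustrative sketch of decoupling a binary lesion mask into soft center and boundary labels via a distance transform (one plausible recipe, not necessarily the LDS used in the paper) is:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def decouple_lesion_mask(mask: np.ndarray, sigma: float = 5.0):
    """Sketch: split a binary lesion mask into soft 'center' and 'boundary'
    labels using the distance of each foreground pixel to the background."""
    mask = mask.astype(bool)
    # Distance of each foreground pixel to the nearest background pixel.
    dist_in = distance_transform_edt(mask)
    # Soft center label: large deep inside the lesion, near 0 at its border.
    center = np.where(mask, 1.0 - np.exp(-dist_in / sigma), 0.0)
    # Soft boundary label: peaks at the lesion border, decays inwards.
    boundary = np.where(mask, np.exp(-dist_in / sigma), 0.0)
    return center, boundary
```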
Collapse
Affiliation(s)
- Huakun Yang
- College of Information Science and Technology, University of Science and Technology of China, Hefei 230041, China
| | - Qian Chen
- Institute of Medical Technology, Peking University Health Science Center, Peking University, Beijing 100191, China; Department of Biomedical Engineering, College of Future Technology, Peking University, Beijing 100871, China
| | - Keren Fu
- College of Computer Science, National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu 610065, China
| | - Lei Zhu
- Institute of Medical Technology, Peking University Health Science Center, Peking University, Beijing 100191, China; Department of Biomedical Engineering, College of Future Technology, Peking University, Beijing 100871, China
| | - Lujia Jin
- Institute of Medical Technology, Peking University Health Science Center, Peking University, Beijing 100191, China; Department of Biomedical Engineering, College of Future Technology, Peking University, Beijing 100871, China
| | - Bensheng Qiu
- College of Information Science and Technology, University of Science and Technology of China, Hefei 230041, China
| | - Qiushi Ren
- Institute of Medical Technology, Peking University Health Science Center, Peking University, Beijing 100191, China; Department of Biomedical Engineering, College of Future Technology, Peking University, Beijing 100871, China
| | - Hongwei Du
- College of Information Science and Technology, University of Science and Technology of China, Hefei 230041, China.
| | - Yanye Lu
- Institute of Medical Technology, Peking University Health Science Center, Peking University, Beijing 100191, China.
| |
Collapse
|
99
|
Su Y, Cheng J, Yi M, Liu H. FAPN: Feature Augmented Pyramid Network for polyp segmentation. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/02/2022]
|
100
|
Polyp detection on video colonoscopy using a hybrid 2D/3D CNN. Med Image Anal 2022; 82:102625. [DOI: 10.1016/j.media.2022.102625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2021] [Revised: 08/22/2022] [Accepted: 09/10/2022] [Indexed: 12/15/2022]
|