1
Liu Z, Yang D, Zhang M, Liu G, Zhang Q, Li X. Inferior Alveolar Nerve Canal Segmentation on CBCT Using U-Net with Frequency Attentions. Bioengineering (Basel) 2024; 11:354. PMID: 38671776; PMCID: PMC11048269; DOI: 10.3390/bioengineering11040354.
Abstract
Accurate inferior alveolar nerve (IAN) canal segmentation has long been considered a crucial task in dentistry, as failing to accurately identify the position of the IAN canal may lead to nerve injury during dental procedures. While IAN canals can be detected from dental cone beam computed tomography, they are usually difficult for dentists to identify precisely, as the canals are thin, small, and span many slices. This paper focuses on improving the accuracy of IAN canal segmentation. By integrating our proposed frequency-domain attention mechanism into UNet, the proposed frequency attention UNet (FAUNet) achieves 75.55% and 81.35% in the Dice and surface Dice coefficients, respectively, substantially higher than competing methods, while adding only 224 parameters to the classical UNet. Compared to the classical UNet, FAUNet achieves gains of 2.39% in the Dice coefficient and 2.82% in the surface Dice coefficient. The potential advantage of developing attention in the frequency domain is also discussed, revealing that frequency-domain attention mechanisms can achieve better performance than their spatial-domain counterparts.
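The abstract does not give the module's exact form; as a rough illustration of attention applied in the frequency domain, here is a minimal NumPy sketch. The transform choice, the weight shape, and the `frequency_attention` name are assumptions for illustration, not the paper's FAUNet:

```python
import numpy as np

def frequency_attention(feature, weights):
    """Illustrative frequency-domain attention (not the paper's FAUNet):
    move a 2D feature map into the frequency domain, reweight each
    frequency bin with a learnable map, and transform back."""
    spectrum = np.fft.fft2(feature)       # spatial -> frequency domain
    attended = spectrum * weights         # per-frequency reweighting
    return np.fft.ifft2(attended).real   # back to the spatial domain

# With identity (all-ones) weights the feature map passes through unchanged.
x = np.random.rand(8, 8)
y = frequency_attention(x, np.ones((8, 8)))
```

In a trained network the weight map would be learned, letting the model amplify or suppress particular spatial-frequency bands of the feature map at negligible parameter cost.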
Affiliation(s)
- Zhiyang Liu
  - College of Electronic Information and Optical Engineering, Nankai University, Tianjin 300350, China
  - Tianjin Key Laboratory of Optoelectronic Sensor and Sensing Network Technology, College of Electronic Information and Optical Engineering, Nankai University, Tianjin 300350, China
- Dong Yang
  - College of Electronic Information and Optical Engineering, Nankai University, Tianjin 300350, China
- Minghao Zhang
  - College of Electronic Information and Optical Engineering, Nankai University, Tianjin 300350, China
- Guohua Liu
  - College of Electronic Information and Optical Engineering, Nankai University, Tianjin 300350, China
  - Tianjin Key Laboratory of Optoelectronic Sensor and Sensing Network Technology, College of Electronic Information and Optical Engineering, Nankai University, Tianjin 300350, China
- Qian Zhang
  - School and Hospital of Stomatology, Tianjin Medical University, Tianjin 300070, China
- Xiaonan Li
  - School and Hospital of Stomatology, Tianjin Medical University, Tianjin 300070, China
2
Yang H, Wang Q, Zhang Y, An Z, Liu C, Zhang X, Zhou SK. Lung Nodule Segmentation and Uncertain Region Prediction With an Uncertainty-Aware Attention Mechanism. IEEE Trans Med Imaging 2024; 43:1284-1295. PMID: 37966939; DOI: 10.1109/tmi.2023.3332944.
Abstract
Radiologists possess diverse training and clinical experiences, leading to variations in the segmentation annotations of lung nodules and resulting in segmentation uncertainty. Conventional methods typically select a single annotation as the learning target or attempt to learn a latent space comprising multiple annotations. However, these approaches fail to leverage the valuable information inherent in the consensus and disagreements among the multiple annotations. In this paper, we propose an Uncertainty-Aware Attention Mechanism (UAAM) that utilizes consensus and disagreements among multiple annotations to facilitate better segmentation. To this end, we introduce the Multi-Confidence Mask (MCM), which combines a Low-Confidence (LC) Mask and a High-Confidence (HC) Mask. The LC mask indicates regions with low segmentation confidence, where radiologists may have different segmentation choices. Following UAAM, we further design an Uncertainty-Guide Multi-Confidence Segmentation Network (UGMCS-Net), which contains three modules: a Feature Extracting Module that captures a general feature of a lung nodule, an Uncertainty-Aware Module that produces three features for the annotations' union, intersection, and annotation set, and an Intersection-Union Constraining Module that uses distances between the three features to balance the predictions of final segmentation and MCM. To comprehensively demonstrate the performance of our method, we propose a Complex-Nodule Validation on LIDC-IDRI, which tests UGMCS-Net's segmentation performance on lung nodules that are difficult to segment using common methods. Experimental results demonstrate that our method can significantly improve the segmentation performance on nodules that are difficult to segment using conventional methods.
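The union/intersection construction described above can be sketched directly; the function name and mask encoding below are illustrative, not the paper's implementation:

```python
import numpy as np

def multi_confidence_mask(annotations):
    """Derive high/low-confidence masks from multiple binary annotations
    of the same nodule. The high-confidence (HC) mask is the voxel-wise
    intersection; the low-confidence (LC) mask is where annotators
    disagree (union minus intersection)."""
    stack = np.stack(annotations).astype(bool)
    union = stack.any(axis=0)
    intersection = stack.all(axis=0)
    hc_mask = intersection
    lc_mask = union & ~intersection   # annotators disagree here
    return hc_mask, lc_mask

# Two hypothetical annotations of the same 2x3 region.
a1 = np.array([[1, 1, 0], [0, 1, 0]])
a2 = np.array([[1, 0, 0], [0, 1, 1]])
hc, lc = multi_confidence_mask([a1, a2])
```

The Multi-Confidence Mask described in the abstract combines these two components, so the network can treat consensus regions and disputed regions differently during training.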
3
Fu Y, Liu J, Shi J. TSCA-Net: Transformer based spatial-channel attention segmentation network for medical images. Comput Biol Med 2024; 170:107938. PMID: 38219644; DOI: 10.1016/j.compbiomed.2024.107938.
Abstract
Deep learning architectures based on convolutional neural networks (CNNs) and Transformers have achieved great success in medical image segmentation. Models based on the encoder-decoder framework, like U-Net, have been successfully employed in many realistic scenarios. However, due to the low contrast between object and background, the varying shapes and scales of objects, and complex backgrounds in medical images, it is difficult to locate targets and obtain good segmentation performance by extracting effective information from images. In this paper, an encoder-decoder architecture based on spatial and channel attention modules built with Transformers is proposed for medical image segmentation. Concretely, Transformer-based spatial and channel attention modules are utilized to extract globally complementary spatial and channel information at different layers of a U-shaped network, which is beneficial for learning detail features at different scales. To better fuse the spatial and channel information from the Transformer features, a spatial and channel feature fusion block is designed for the decoder. The proposed network inherits the advantages of both CNNs and Transformers, combining local feature representation with long-range dependencies for medical images. Qualitative and quantitative experiments demonstrate that the proposed method outperforms eight state-of-the-art segmentation methods on five public medical image datasets spanning different modalities, achieving, for example, Dice values of 80.23% and 93.56% and Intersection over Union (IoU) values of 67.13% and 88.94% on the Multi-organ Nucleus Segmentation (MoNuSeg) and Combined Healthy Abdominal Organ Segmentation with Computed Tomography scans (CHAOS-CT) datasets, respectively.
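As a rough illustration of channel attention computed in Transformer style (one plausible reading of the channel attention module described above; the paper's actual TSCA modules are more elaborate), a minimal NumPy sketch in which each channel is treated as a token attending to every other channel:

```python
import numpy as np

def channel_self_attention(x):
    """x: (channels, features) - each channel is one token.
    Scaled dot-product self-attention over the channel axis, so every
    channel aggregates information from all other channels."""
    d = x.shape[1]
    scores = x @ x.T / np.sqrt(d)                 # channel-to-channel affinity
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)      # softmax over channels
    return attn @ x                               # reweighted channel mix

x = np.random.rand(4, 16)
y = channel_self_attention(x)
```

A spatial counterpart would transpose the roles, treating each spatial position as a token; the fusion block described in the abstract would then combine the two outputs.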
Affiliation(s)
- Yinghua Fu
  - School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, China
- Junfeng Liu
  - School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, China
- Jun Shi
  - School of Communication and Information Engineering, Shanghai University, Shanghai, 200444, China
4
Xu S, Duan L, Zhang Y, Zhang Z, Sun T, Tian L. Graph- and transformer-guided boundary aware network for medical image segmentation. Comput Methods Programs Biomed 2023; 242:107849. PMID: 37837887; DOI: 10.1016/j.cmpb.2023.107849.
Abstract
BACKGROUND AND OBJECTIVE Despite the considerable progress achieved by U-Net-based models, medical image segmentation remains a challenging task due to complex backgrounds, irrelevant noises, and ambiguous boundaries. In this study, we present a novel approach called U-shaped Graph- and Transformer-guided Boundary Aware Network (GTBA-Net) to tackle these challenges. METHODS GTBA-Net uses the pre-trained ResNet34 as its basic structure, and involves Global Feature Aggregation (GFA) modules for target localization, Graph-based Dynamic Feature Fusion (GDFF) modules for effective noise suppression, and Uncertainty-based Boundary Refinement (UBR) modules for accurate delineation of ambiguous boundaries. The GFA modules employ an efficient self-attention mechanism to facilitate coarse target localization amidst complex backgrounds, without introducing additional computational complexity. The GDFF modules leverage graph attention mechanism to aggregate information hidden among high- and low-level features, effectively suppressing target-irrelevant noises while preserving valuable spatial details. The UBR modules introduce an uncertainty quantification strategy and auxiliary loss to guide the model's focus towards target regions and uncertain "ridges", gradually mitigating boundary uncertainty and ultimately achieving accurate boundary delineation. RESULTS Comparative experiments on five datasets encompassing diverse modalities (including X-ray, CT, endoscopic procedures, and ultrasound) demonstrate that the proposed GTBA-Net outperforms existing methods in various challenging scenarios. Subsequent ablation studies further demonstrate the efficacy of the GFA, GDFF, and UBR modules in target localization, noise suppression, and ambiguous boundary delineation, respectively. 
CONCLUSIONS GTBA-Net exhibits substantial potential for extensive application in the field of medical image segmentation, particularly in scenarios involving complex backgrounds, target-irrelevant noises, or ambiguous boundaries.
Affiliation(s)
- Shanshan Xu
  - School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China; Beijing Key Laboratory of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China
- Lianhong Duan
  - The Second School of Clinical Medicine, Southern Medical University, Guangzhou, China; Senior Department of Orthopedics, The Fourth Medical Center of PLA General Hospital, Beijing, China
- Yang Zhang
  - Senior Department of Orthopedics, The Fourth Medical Center of PLA General Hospital, Beijing, China
- Zhicheng Zhang
  - Senior Department of Orthopedics, The Fourth Medical Center of PLA General Hospital, Beijing, China
- Tiansheng Sun
  - The Second School of Clinical Medicine, Southern Medical University, Guangzhou, China; Senior Department of Orthopedics, The Fourth Medical Center of PLA General Hospital, Beijing, China
- Lixia Tian
  - School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
5
Jiang X, Zhu Y, Liu Y, Wang N, Yi L. MC-DC: An MLP-CNN Based Dual-path Complementary Network for Medical Image Segmentation. Comput Methods Programs Biomed 2023; 242:107846. PMID: 37806121; DOI: 10.1016/j.cmpb.2023.107846.
Abstract
BACKGROUND Fusing the CNN and Transformer in the encoder has recently achieved outstanding performance in medical image segmentation. However, two obvious limitations require addressing: (1) The utilization of Transformer leads to heavy parameters, and its intricate structure demands ample data and resources for training, and (2) most previous research had predominantly focused on enhancing the performance of the feature encoder, with little emphasis placed on the design of the feature decoder. METHODS To this end, we propose a novel MLP-CNN based dual-path complementary (MC-DC) network for medical image segmentation, which replaces the complex Transformer with a cost-effective Multi-Layer Perceptron (MLP). Specifically, a dual-path complementary (DPC) module is designed to effectively fuse multi-level features from MLP and CNN. To respectively reconstruct global and local information, the dual-path decoder is proposed which is mainly composed of cross-scale global feature fusion (CS-GF) module and cross-scale local feature fusion (CS-LF) module. Moreover, we leverage a simple and efficient segmentation mask feature fusion (SMFF) module to merge the segmentation outcomes generated by the dual-path decoder. RESULTS Comprehensive experiments were performed on three typical medical image segmentation tasks. For skin lesions segmentation, our MC-DC network achieved 91.69% Dice and 9.52mm ASSD on the ISIC2018 dataset. In addition, the 91.6% Dice and 94.4% Dice were respectively obtained on the Kvasir-SEG dataset and CVC-ClinicDB dataset for polyp segmentation. Moreover, we also conducted experiments on the private COVID-DS36 dataset for lung lesion segmentation. Our MC-DC has achieved 87.6% [87.1%, 88.1%], and 92.3% [91.8%, 92.7%] on ground-glass opacity, interstitial infiltration, and lung consolidation, respectively. 
CONCLUSIONS The experimental results indicate that the proposed MC-DC network exhibits exceptional generalization capability and surpasses other state-of-the-art methods, achieving higher accuracy with lower computational complexity.
Affiliation(s)
- Xiaoben Jiang
  - School of Information Science and Technology, East China University of Science and Technology, Shanghai, 200237, China
- Yu Zhu
  - School of Information Science and Technology, East China University of Science and Technology, Shanghai, 200237, China
- Yatong Liu
  - School of Information Science and Technology, East China University of Science and Technology, Shanghai, 200237, China
- Nan Wang
  - School of Information Science and Technology, East China University of Science and Technology, Shanghai, 200237, China
- Lei Yi
  - Department of Burn, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
6
Wang L, Ye M, Lu Y, Qiu Q, Niu Z, Shi H, Wang J. A combined encoder-transformer-decoder network for volumetric segmentation of adrenal tumors. Biomed Eng Online 2023; 22:106. PMID: 37940921; PMCID: PMC10631161; DOI: 10.1186/s12938-023-01160-5.
Abstract
BACKGROUND The morphology of the adrenal tumor and the clinical statistics of the adrenal tumor area are two crucial diagnostic and differential diagnostic features, so precise tumor segmentation is essential. We therefore built a CT image segmentation method based on an encoder-decoder structure combined with a Transformer for volumetric segmentation of adrenal tumors. METHODS This study included a total of 182 patients with adrenal metastases, and an adrenal tumor volumetric segmentation method combining an encoder-decoder structure and a Transformer was constructed. The Dice Score coefficient (DSC), Hausdorff distance, Intersection over Union (IOU), Average Surface Distance (ASD), and Mean Average Error (MAE) were calculated to evaluate the performance of the segmentation method. RESULTS Comparisons were made between our proposed method and other CNN-based and Transformer-based methods. The results showed excellent segmentation performance, with a mean DSC of 0.858, a mean Hausdorff distance of 10.996, a mean IOU of 0.814, a mean MAE of 0.0005, and a mean ASD of 0.509. The boxplot of segmentation performance over all test samples implies that the proposed method has the lowest skewness and the highest average prediction performance. CONCLUSIONS Our proposed method can directly generate 3D lesion maps and showed excellent segmentation performance. The comparison of segmentation metrics and visualization results showed that our proposed method performed very well.
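The DSC and IOU reported above have standard definitions for binary masks; a minimal NumPy version:

```python
import numpy as np

def dice_score(pred, gt):
    """Dice coefficient: 2|A∩B| / (|A|+|B|) for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())

def iou(pred, gt):
    """Intersection over Union: |A∩B| / |A∪B| for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union

# Toy 1D example: 1 overlapping voxel out of 2 predicted and 2 true.
pred = np.array([1, 1, 0, 0])
gt = np.array([1, 0, 1, 0])
d = dice_score(pred, gt)   # 2*1/(2+2) = 0.5
j = iou(pred, gt)          # 1/3
```

The same formulas apply voxel-wise to 3D volumes; Hausdorff and average surface distances additionally require extracting the mask boundaries.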
Affiliation(s)
- Liping Wang
  - College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, Zhejiang, China
- Mingtao Ye
  - College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, Zhejiang, China
- Yanjie Lu
  - College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, Zhejiang, China
- Qicang Qiu
  - Zhejiang Lab, No. 1818, Western Road of Wenyi, Hangzhou, Zhejiang, China
- Zhongfeng Niu
  - Department of Radiology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Hengfeng Shi
  - Department of Radiology, Anqing Municipal Hospital, Anqing, Anhui, China
- Jian Wang
  - Department of Radiology, Tongde Hospital of Zhejiang Province, No.234, Gucui Road, Hangzhou, Zhejiang, China
7
Wang Y, Cui W, Yu T, Li X, Liao X, Li Y. Dynamic Multi-Graph Convolution-Based Channel-Weighted Transformer Feature Fusion Network for Epileptic Seizure Prediction. IEEE Trans Neural Syst Rehabil Eng 2023; 31:4266-4277. PMID: 37782584; DOI: 10.1109/tnsre.2023.3321414.
Abstract
Electroencephalogram (EEG) based seizure prediction plays an important role in closed-loop neuromodulation systems. However, most existing seizure prediction methods based on graph convolution networks have focused only on constructing static graphs, ignoring multi-domain dynamic changes in the deep graph structure. Moreover, existing feature fusion strategies generally concatenate coarse-grained epileptic EEG features directly, leading to suboptimal seizure prediction performance. To address these issues, we propose a novel multi-branch dynamic multi-graph convolution based channel-weighted transformer feature fusion network (MB-dMGC-CWTFFNet) for patient-specific seizure prediction with superior performance. Specifically, a multi-branch (MB) feature extractor is first applied to jointly capture the temporal, spatial, and spectral representations from the epileptic EEG. Then, we design a point-wise dynamic multi-graph convolution network (dMGCN) to dynamically learn deep graph structures, which can effectively extract high-level features from the multi-domain graph. Finally, by integrating local and global channel-weighted strategies with the multi-head self-attention mechanism, a channel-weighted transformer feature fusion network (CWTFFNet) is adopted to efficiently fuse the multi-domain graph features. The proposed MB-dMGC-CWTFFNet is evaluated on the public CHB-MIT EEG dataset and a private intracranial sEEG dataset, and the experimental results demonstrate that our proposed method achieves outstanding prediction performance compared with state-of-the-art methods, indicating an effective tool for patient-specific seizure warning. Our code will be available at: https://github.com/Rockingsnow/MB-dMGC-CWTFFNet.
8
AL Qurri A, Almekkawy M. Improved UNet with Attention for Medical Image Segmentation. Sensors (Basel) 2023; 23:8589. PMID: 37896682; PMCID: PMC10611347; DOI: 10.3390/s23208589.
Abstract
Medical image segmentation is crucial for medical image processing and the development of computer-aided diagnostics. In recent years, deep Convolutional Neural Networks (CNNs) have been widely adopted for medical image segmentation and have achieved significant success. UNet, which is based on CNNs, is the mainstream method used for medical image segmentation. However, its performance suffers owing to its inability to capture long-range dependencies. Transformers, initially designed for Natural Language Processing (NLP) and sequence-to-sequence applications, have demonstrated the ability to capture long-range dependencies. However, their ability to acquire local information is limited. Hybrid architectures of CNNs and Transformers, such as TransUNet, have been proposed to benefit from the Transformer's long-range dependencies and the CNN's low-level details. Nevertheless, automatic medical image segmentation remains a challenging task due to factors such as blurred boundaries, low-contrast tissue environments, and, in the context of ultrasound, issues like speckle noise and attenuation. In this paper, we propose a new model that combines the strengths of both CNNs and Transformers, with network architectural improvements designed to enrich the feature representation captured by the skip connections and the decoder. To this end, we devised a new attention module called Three-Level Attention (TLA). This module is composed of an Attention Gate (AG), channel attention, and a spatial normalization mechanism. The AG preserves structural information, whereas channel attention helps to model the interdependencies between channels. Spatial normalization employs the spatial coefficient of the Transformer to improve spatial attention, akin to TransNorm. To further improve the skip connections and reduce the semantic gap, the skip connections between the encoder and decoder were redesigned in a manner similar to the UNet++ dense connections.
Moreover, deep supervision using a side-output channel was introduced, analogous to BASNet, which was originally used for saliency predictions. Two datasets from different modalities, a CT scan dataset and an ultrasound dataset, were used to evaluate the proposed UNet architecture. The experimental results showed that our model consistently improved the prediction performance of the UNet across different datasets.
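The Attention Gate (AG) component has a well-known additive form from Attention U-Net; the sketch below shows only that part (the TLA module's channel attention and spatial normalization are omitted, and the weight shapes are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_gate(skip, gate, w_x, w_g, psi):
    """Additive attention gate in the spirit of Attention U-Net.
    skip, gate: (channels, positions); w_x, w_g: (inter, channels);
    psi: (1, inter). The decoder's gating signal produces a per-position
    coefficient in (0, 1) that rescales the encoder's skip feature."""
    q = np.maximum(w_x @ skip + w_g @ gate, 0.0)  # additive attention + ReLU
    alpha = sigmoid(psi @ q)                      # attention coefficients
    return skip * alpha                           # suppress irrelevant regions

rng = np.random.default_rng(0)
skip = rng.normal(size=(8, 25))
gate = rng.normal(size=(8, 25))
w_x = rng.normal(size=(4, 8))
w_g = rng.normal(size=(4, 8))
psi = rng.normal(size=(1, 4))
out = attention_gate(skip, gate, w_x, w_g, psi)
```

Because the coefficients lie in (0, 1), the gate can only attenuate the skip feature, which is how it filters regions irrelevant to the decoder's current query.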
9
Radiya K, Joakimsen HL, Mikalsen KØ, Aahlin EK, Lindsetmo RO, Mortensen KE. Performance and clinical applicability of machine learning in liver computed tomography imaging: a systematic review. Eur Radiol 2023; 33:6689-6717. PMID: 37171491; PMCID: PMC10511359; DOI: 10.1007/s00330-023-09609-w.
Abstract
OBJECTIVES Machine learning (ML) for medical imaging is emerging for several organs and image modalities. Our objectives were to provide clinicians with an overview of this field by answering the following questions: (1) How is ML applied in liver computed tomography (CT) imaging? (2) How well do ML systems perform in liver CT imaging? (3) What are the clinical applications of ML in liver CT imaging? METHODS A systematic review was carried out according to the guidelines of the PRISMA-P statement. The search string focused on studies containing content relating to artificial intelligence, liver, and computed tomography. RESULTS One hundred ninety-one studies were included. In the majority of studies, ML was applied to CT liver imaging through image analysis without clinicians' intervention, while newer studies combined ML methods with clinical intervention. Several models were documented to perform very accurately, albeit on small datasets. Most models identified were deep learning-based, mainly using convolutional neural networks. Our review identified many potential clinical applications of ML in CT liver imaging, including segmentation and classification of the liver and its lesions, segmentation of vascular structures inside the liver, fibrosis and cirrhosis staging, metastasis prediction, and evaluation of chemotherapy. CONCLUSION Several studies attempted to provide transparent results for their models. To make these models suitable for clinical application, prospective clinical validation studies are urgently needed; computer scientists and engineers should seek to cooperate with health professionals to ensure this. KEY POINTS • ML shows great potential for CT liver image tasks such as pixel-wise segmentation and classification of liver and liver lesions, fibrosis staging, metastasis prediction, and retrieval of relevant liver lesions from similar cases of other patients.
• Although result reporting is not standardized, many studies have attempted to provide transparent results for interpreting machine learning performance. • Prospective studies for clinical validation of ML methods are urgently needed, preferably carried out in cooperation between clinicians and computer scientists.
Affiliation(s)
- Keyur Radiya
  - Department of Gastroenterological Surgery at University Hospital of North Norway (UNN), Tromso, Norway
  - Department of Clinical Medicine, UiT The Arctic University of Norway, Tromso, Norway
- Henrik Lykke Joakimsen
  - Institute of Clinical Medicine, UiT The Arctic University of Norway, Tromso, Norway
  - Centre for Clinical Artificial Intelligence (SPKI), University Hospital of North Norway, Tromso, Norway
- Karl Øyvind Mikalsen
  - Department of Clinical Medicine, UiT The Arctic University of Norway, Tromso, Norway
  - Centre for Clinical Artificial Intelligence (SPKI), University Hospital of North Norway, Tromso, Norway
  - UiT Machine Learning Group, Department of Physics and Technology, UiT The Arctic University of Norway, Tromso, Norway
- Eirik Kjus Aahlin
  - Department of Gastroenterological Surgery at University Hospital of North Norway (UNN), Tromso, Norway
- Rolv-Ole Lindsetmo
  - Department of Clinical Medicine, UiT The Arctic University of Norway, Tromso, Norway
  - Head Clinic of Surgery, Oncology and Women Health, University Hospital of North Norway, Tromso, Norway
- Kim Erlend Mortensen
  - Department of Gastroenterological Surgery at University Hospital of North Norway (UNN), Tromso, Norway
  - Department of Clinical Medicine, UiT The Arctic University of Norway, Tromso, Norway
10
Pan H, Gao B, Bai W, Li B, Li Y, Zhang M, Wang H, Zhao X, Chen M, Yin C, Kong W. WA-ResUNet: A Focused Tail Class MRI Medical Image Segmentation Algorithm. Bioengineering (Basel) 2023; 10:945. PMID: 37627829; PMCID: PMC10451191; DOI: 10.3390/bioengineering10080945.
Abstract
Medical image segmentation can effectively identify lesions, but some small and rare lesions are not well identified. Existing studies do not take into account the uncertainty of the occurrence of diseased tissue or the long-tailed distribution of medical data. Meanwhile, the grayscale images obtained from Magnetic Resonance Imaging (MRI) pose their own problems: features are difficult to extract, and invalid features are difficult to distinguish from valid ones. To solve these problems, we propose a new weighted attention ResUNet (WA-ResUNet) together with a class weight formula based on the number of images contained in each class. The method improves both the performance on low-frequency classes and the overall performance of the model by increasing the attention paid to valid features, reducing the attention paid to invalid ones, and rebalancing the learning efficiency among the classes. We evaluated our method on a uterine MRI dataset and compared it with ResUNet. WA-ResUNet increased Intersection over Union (IoU) in the low-frequency class (Nabothian cysts) by 21.87%, and the overall mIoU increased by more than 6.5%.
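The abstract does not quote the class weight formula itself; the inverse-frequency weighting below is a common stand-in for weighting classes by per-class image counts, not the authors' exact formula:

```python
import numpy as np

def inverse_frequency_weights(image_counts):
    """Illustrative inverse-frequency class weighting: a class's weight is
    total_count / (n_classes * class_count), so tail classes with few
    images receive proportionally larger loss weights."""
    counts = np.asarray(image_counts, dtype=float)
    return counts.sum() / (len(counts) * counts)

# e.g. a frequent class (900 images) vs. a rare tail class (100 images)
w = inverse_frequency_weights([900, 100])
```

Weights of this shape are typically passed to a per-class weighted cross-entropy or Dice loss so the optimizer does not neglect the tail class.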
Affiliation(s)
- Haixia Pan
  - College of Software, Beihang University, Beijing 100191, China
- Bo Gao
  - College of Software, Beihang University, Beijing 100191, China
- Wenpei Bai
  - Department of Obstetrics and Gynecology, Beijing Shijitan Hospital, Capital Medical University, Beijing 100038, China
- Bin Li
  - Department of MRI, Beijing Shijitan Hospital, Capital Medical University/Ninth Clinical Medical College, Peking University, Beijing 100038, China
- Yanan Li
  - College of Software, Beihang University, Beijing 100191, China
- Meng Zhang
  - College of Software, Beihang University, Beijing 100191, China
- Hongqiang Wang
  - College of Software, Beihang University, Beijing 100191, China
- Xiaoran Zhao
  - College of Software, Beihang University, Beijing 100191, China
- Minghuang Chen
  - Department of Obstetrics and Gynecology, Beijing Shijitan Hospital, Capital Medical University, Beijing 100038, China
- Cong Yin
  - Department of Obstetrics and Gynecology, Beijing Shijitan Hospital, Capital Medical University, Beijing 100038, China
- Weiya Kong
  - Department of Obstetrics and Gynecology, Beijing Shijitan Hospital, Capital Medical University, Beijing 100038, China
11
Zhou H, Sun C, Huang H, Fan M, Yang X, Zhou L. Feature-guided attention network for medical image segmentation. Med Phys 2023; 50:4871-4886. PMID: 36746870; DOI: 10.1002/mp.16253.
Abstract
BACKGROUND U-Net and its variations have achieved remarkable performance in medical image segmentation. However, they have two limitations. First, the shallow-layer features of the encoder often contain background noise. Second, semantic gaps exist between the features of the encoder and the decoder: skip-connections directly connect the encoder to the decoder, leading to the fusion of semantically dissimilar feature maps. PURPOSE To overcome these two limitations, this paper proposes a novel medical image segmentation algorithm, called feature-guided attention network, which consists of U-Net, the cross-level attention filtering module (CAFM), and the attention-guided upsampling module (AUM). METHODS In the proposed method, the CAFM and the AUM were introduced into the U-Net, where the CAFM learns to filter the background noise in the low-level feature map of the encoder and the AUM tries to eliminate the semantic gap between the encoder and the decoder. Specifically, the CAFM adopts a top-down pathway that uses the high-level feature map to filter the background noise in the low-level feature map of the encoder, while the AUM uses the encoder features to guide the upsampling of the corresponding decoder features, thus eliminating the semantic gap between them. Four medical image segmentation tasks, including coronary atherosclerotic plaque segmentation (Dataset A), retinal vessel segmentation (Dataset B), skin lesion segmentation (Dataset C), and multiclass retinal edema lesions segmentation (Dataset D), were used to validate the proposed method. RESULTS For Dataset A, the proposed method achieved higher Intersection over Union (IoU) (67.91 ± 3.82%), Dice (79.39 ± 3.37%), accuracy (98.39 ± 0.34%), and sensitivity (85.10 ± 3.74%) than the previous best method, CA-Net.
For Dataset B, the proposed method achieved higher sensitivity (83.50%) and accuracy (97.55%) than the previous best method, SCS-Net. For Dataset C, the proposed method had the highest IoU (83.47 ± 0.41%) and Dice (90.81 ± 0.34%) of all compared previous methods. For Dataset D, the proposed method had the highest Dice (average: 81.53%; retinal edema area [REA]: 83.78%; pigment epithelial detachment [PED]: 77.13%), sensitivity (REA: 89.01%; SRF: 85.50%), specificity (REA: 99.35%; PED: 100.00%), and accuracy (98.73%) among all compared previous networks. In addition, the proposed method has 2.43 M parameters, fewer than CA-Net (3.21 M) and CPF-Net (3.07 M). CONCLUSIONS The proposed method demonstrated state-of-the-art performance, outperforming other top-notch medical image segmentation algorithms. The CAFM filtered the background noise in the low-level feature map of the encoder, while the AUM eliminated the semantic gap between the encoder and the decoder. Furthermore, the proposed method is computationally efficient.
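The top-down filtering idea in this abstract, a deeper, semantically richer feature map gating a noisy shallow one, can be sketched in a few lines of NumPy. This is only an illustration of the general mechanism, not the paper's implementation; the function names and the nearest-neighbour upsampling are simplifying assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def attention_filter(low, high):
    """Gate a shallow encoder feature map with a sigmoid mask derived from a
    coarser, semantically richer feature map, suppressing background
    activations in the low-level features (hypothetical simplification)."""
    mask = sigmoid(upsample2x(high))     # same shape as `low`, values in (0, 1)
    return low * mask

low = np.random.rand(4, 8, 8)            # shallow encoder features (C, H, W)
high = np.random.rand(4, 4, 4)           # deeper, semantically richer features
filtered = attention_filter(low, high)
assert filtered.shape == low.shape
```

Because the sigmoid mask lies strictly in (0, 1), gated responses can only shrink, which is what suppresses background activations.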
Collapse
Affiliation(s)
- Hao Zhou
- National Key Laboratory of Science and Technology of Underwater Vehicle, Harbin Engineering University, Harbin, China
| | - Chaoyu Sun
- Fourth Affiliated Hospital, Harbin Medical University, Harbin, China
| | - Hai Huang
- National Key Laboratory of Science and Technology of Underwater Vehicle, Harbin Engineering University, Harbin, China
| | - Mingyu Fan
- College of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou, China
| | - Xu Yang
- State Key Laboratory of Management and Control for Complex System, Institute of Automation, Chinese Academy of Sciences, Beijing, China
| | - Linxiao Zhou
- Fourth Affiliated Hospital, Harbin Medical University, Harbin, China
| |
Collapse
|
12
|
Lakshmipriya B, Pottakkat B, Ramkumar G. Deep learning techniques in liver tumour diagnosis using CT and MR imaging - A systematic review. Artif Intell Med 2023; 141:102557. [PMID: 37295904 DOI: 10.1016/j.artmed.2023.102557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/04/2022] [Revised: 04/15/2023] [Accepted: 04/18/2023] [Indexed: 06/12/2023]
Abstract
Deep learning has become a thriving force in the computer-aided diagnosis of liver cancer, as it solves extremely complicated challenges with high accuracy over time and facilitates medical experts in their diagnostic and treatment procedures. This paper presents a comprehensive systematic review of deep learning techniques applied to various applications pertaining to liver images, the challenges faced by clinicians in liver tumour diagnosis, and how deep learning bridges the gap between clinical practice and technological solutions, with an in-depth summary of 113 articles. Since deep learning is an emerging revolutionary technology, recent state-of-the-art research on liver images is reviewed with a focus on classification, segmentation, and clinical applications in the management of liver diseases. Additionally, similar review articles in the literature are reviewed and compared. The review concludes by presenting the contemporary trends and unaddressed research issues in the field of liver tumour diagnosis, offering directions for future research in this field.
Collapse
Affiliation(s)
- B Lakshmipriya
- Department of Surgical Gastroenterology, Jawaharlal Institute of Postgraduate Medical Education and Research, Puducherry, India
| | - Biju Pottakkat
- Department of Surgical Gastroenterology, Jawaharlal Institute of Postgraduate Medical Education and Research, Puducherry, India.
| | - G Ramkumar
- Department of Radio Diagnosis, Jawaharlal Institute of Postgraduate Medical Education and Research, Puducherry, India
| |
Collapse
|
13
|
Zhan B, Song E, Liu H. FSA-Net: Rethinking the attention mechanisms in medical image segmentation from releasing global suppressed information. Comput Biol Med 2023; 161:106932. [PMID: 37230013 DOI: 10.1016/j.compbiomed.2023.106932] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/28/2022] [Revised: 03/28/2023] [Accepted: 04/13/2023] [Indexed: 05/27/2023]
Abstract
Attention mechanism-based medical image segmentation methods have developed rapidly in recent years. For an attention mechanism, it is crucial to accurately capture the distribution weights of the effective features contained in the data. To accomplish this task, most attention mechanisms use a global squeezing approach. However, this leads to over-focusing on the globally most salient effective features of the region of interest while suppressing the secondary salient ones, so that some fine-grained features are discarded outright. To address this issue, we propose a multiple-local-perception method to aggregate global effective features, and design a fine-grained medical image segmentation network, named FSA-Net. This network consists of two key components: 1) the novel Separable Attention Mechanisms, which replace global squeezing with local squeezing to release the suppressed secondary salient effective features; and 2) a Multi-Attention Aggregator (MAA), which fuses multi-level attention to efficiently aggregate task-relevant semantic information. We conduct extensive experimental evaluations on six publicly available medical image segmentation datasets: MoNuSeg, COVID-19-CT100, GlaS, CVC-ClinicDB, ISIC2018, and DRIVE. Experimental results show that FSA-Net outperforms state-of-the-art methods in medical image segmentation.
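The contrast between global and local squeezing is easy to see in NumPy: a global squeeze keeps one descriptor per channel, while a local squeeze keeps one per region, so secondary salient peaks survive the squeezing step. A hedged sketch (the uniform grid partition is my own simplification, not FSA-Net's exact operator):

```python
import numpy as np

def global_squeeze(x):
    """Classic SE-style global squeeze: one descriptor per channel."""
    return x.mean(axis=(1, 2))                       # (C,)

def local_squeeze(x, grid=2):
    """Local squeeze: pool each of grid x grid sub-regions separately,
    so secondary salient responses are not averaged away."""
    C, H, W = x.shape
    h, w = H // grid, W // grid
    out = np.empty((C, grid, grid))
    for i in range(grid):
        for j in range(grid):
            out[:, i, j] = x[:, i*h:(i+1)*h, j*w:(j+1)*w].mean(axis=(1, 2))
    return out                                       # (C, grid, grid)

x = np.random.rand(3, 8, 8)                          # (channels, H, W)
g = global_squeeze(x)
loc = local_squeeze(x)
assert g.shape == (3,) and loc.shape == (3, 2, 2)
```

Averaging the local descriptors recovers the global one, so the local squeeze is a strict refinement: it keeps everything the global squeeze keeps, plus where each response came from.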
Collapse
Affiliation(s)
- Bangcheng Zhan
- School of Computer Science & Technology, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
| | - Enmin Song
- School of Computer Science & Technology, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China.
| | - Hong Liu
- School of Computer Science & Technology, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
| |
Collapse
|
14
|
Shi J, Sun B, Ye X, Wang Z, Luo X, Liu J, Gao H, Li H. Semantic Decomposition Network With Contrastive and Structural Constraints for Dental Plaque Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:935-946. [PMID: 36367911 DOI: 10.1109/tmi.2022.3221529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Segmenting dental plaque from images of medical reagent staining provides valuable information for diagnosis and the determination of a follow-up treatment plan. However, accurate dental plaque segmentation is a challenging task that requires identifying teeth and dental plaque subject to semantic-blur regions (i.e., confused boundaries in border regions between teeth and dental plaque) and complex variations of instance shapes, which are not fully addressed by existing methods. Therefore, we propose a semantic decomposition network (SDNet) that introduces two single-task branches to separately address the segmentation of teeth and dental plaque and designs additional constraints to learn category-specific features for each branch, thus facilitating the semantic decomposition and improving the performance of dental plaque segmentation. Specifically, SDNet learns two separate segmentation branches for teeth and dental plaque in a divide-and-conquer manner to decouple the entangled relation between them. Each branch that specifies a category tends to yield accurate segmentation. To help these two branches better focus on category-specific features, two constraint modules are further proposed: 1) a contrastive constraint module (CCM) to learn discriminative feature representations by maximizing the distance between different category representations, so as to reduce the negative impact of semantic-blur regions on feature extraction; 2) a structural constraint module (SCM) to provide complete structural information for dental plaque of various shapes through the supervision of a boundary-aware geometric constraint. Besides, we construct a large-scale open-source Stained Dental Plaque Segmentation dataset (SDPSeg), which provides high-quality annotations for teeth and dental plaque. Experimental results on the SDPSeg dataset show that SDNet achieves state-of-the-art performance.
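The contrastive constraint described above, maximizing the distance between category representations, can be illustrated with a minimal hinge-style loss. This is a stand-in for the idea only; the paper's CCM is defined over learned feature maps, and the function below is a simplifying sketch:

```python
import numpy as np

def category_contrast_loss(feat_a, feat_b, margin=1.0):
    """Hinge-style contrastive constraint: push the mean embeddings of two
    categories (e.g. teeth vs. dental plaque) at least `margin` apart.
    A simplified stand-in for the paper's CCM, not its exact loss."""
    d = np.linalg.norm(feat_a.mean(axis=0) - feat_b.mean(axis=0))
    return max(0.0, margin - d)

teeth = np.zeros((5, 4))          # toy per-pixel embeddings, category "teeth"
plaque = np.ones((5, 4)) * 2.0    # toy embeddings, category "plaque"
assert category_contrast_loss(teeth, plaque) == 0.0   # already far apart
assert category_contrast_loss(teeth, teeth) == 1.0    # identical -> full margin
```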
Collapse
|
15
|
Pan S, Liu X, Xie N, Chong Y. EG-TransUNet: a transformer-based U-Net with enhanced and guided models for biomedical image segmentation. BMC Bioinformatics 2023; 24:85. [PMID: 36882688 PMCID: PMC9989586 DOI: 10.1186/s12859-023-05196-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2022] [Accepted: 02/20/2023] [Indexed: 03/09/2023] Open
Abstract
Although various methods based on convolutional neural networks have improved the performance of biomedical image segmentation to meet the precision requirements of medical imaging segmentation tasks, medical image segmentation methods based on deep learning still need to solve the following problems: (1) difficulty in extracting the discriminative features of the lesion region in medical images during the encoding process due to variable sizes and shapes; (2) difficulty in effectively fusing spatial and semantic information of the lesion region during the decoding process due to redundant information and the semantic gap. In this paper, we use the attention-based Transformer during the encoder and decoder stages to improve feature discrimination at the level of spatial detail and semantic location through its multi-head self-attention. Accordingly, we propose an architecture called EG-TransUNet, including three modules improved by a Transformer: a progressive enhancement module, channel spatial attention, and semantic guidance attention. The proposed EG-TransUNet architecture allowed us to capture object variabilities with improved results on different biomedical datasets. EG-TransUNet outperformed other methods on two popular colonoscopy datasets (Kvasir-SEG and CVC-ClinicDB), achieving 93.44% and 95.26% mDice, respectively. Extensive experiments and visualization results demonstrate that our method advances performance on five medical segmentation datasets with better generalization ability.
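The multi-head self-attention that EG-TransUNet relies on can be written out in plain NumPy. A minimal single-layer sketch, not the paper's module: only the three projection matrices, no masking, output projection, or normalization layers:

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, wq, wk, wv, heads=2):
    """Minimal multi-head self-attention over a token sequence x of shape
    (n, d); wq, wk, wv are (d, d) projections split evenly across heads."""
    n, d = x.shape
    dh = d // heads
    q, k, v = x @ wq, x @ wk, x @ wv
    out = np.empty_like(x)
    for h in range(heads):
        s = slice(h * dh, (h + 1) * dh)
        attn = softmax(q[:, s] @ k[:, s].T / np.sqrt(dh))  # rows sum to 1
        out[:, s] = attn @ v[:, s]
    return out

rng = np.random.default_rng(0)
d = 8
wq, wk, wv = (rng.standard_normal((d, d)) for _ in range(3))
x = np.ones((4, d))                   # four identical tokens
out = multi_head_self_attention(x, wq, wk, wv)
assert out.shape == x.shape
```

With identical input tokens the attention weights are uniform, so every output token is just the shared value projection: a quick sanity check that the weighting behaves as expected.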
Collapse
Affiliation(s)
- Shaoming Pan
- The State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan, China
| | - Xin Liu
- The State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan, China
| | - Ningdi Xie
- The State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan, China
| | - Yanwen Chong
- The State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan, China.
| |
Collapse
|
16
|
Guo W, Yang G, Li G, Ruan L, Liu K, Li Q. Remote sensing identification of green plastic cover in urban built-up areas. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2023; 30:37055-37075. [PMID: 36565426 DOI: 10.1007/s11356-022-24911-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/04/2022] [Accepted: 12/18/2022] [Indexed: 06/17/2023]
Abstract
Urban renewal can transform areas that are not adapted to modern urban life, allowing them to redevelop and flourish; however, the renewal process generates many new construction sites, producing environmentally harmful construction dust. The widespread use of urban green plastic cover (GPC) at construction sites and the development of high-resolution satellites have made it possible to extract the spatial distribution of construction sites and provide a basis for environmental protection authorities to protect against dust sources. Existing GPC extraction methods based on remote sensing images either struggle to obtain the exact boundary of the GPC or cannot provide corresponding algorithms for different application scenarios. To determine the distribution of green plastic cover in the built-up area, this paper applies a variety of typical machine learning algorithms to classify the land cover of the test-area image and, through accuracy evaluation, selects K-nearest neighbour as the best machine learning algorithm. Multiple deep learning methods were then applied, and the networks with the highest overall scores were selected by comparing various aspects. These networks were used to predict the GPC of the test-area image, and the accuracy evaluation showed that the segmentation accuracy of deep learning was much higher than that of the machine learning methods, although prediction took more time. Therefore, considering different application scenarios, this paper gives corresponding suggested methods for GPC extraction.
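The K-nearest-neighbour baseline that wins the paper's machine-learning comparison is simple enough to state in full. A toy sketch assuming per-pixel feature vectors and integer class labels (the feature design and `k` here are illustrative, not the paper's configuration):

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Plain k-nearest-neighbour classification by Euclidean distance:
    take the majority class among the k closest training samples."""
    d = np.linalg.norm(X_train - x, axis=1)
    nearest = y_train[np.argsort(d)[:k]]
    vals, counts = np.unique(nearest, return_counts=True)
    return vals[np.argmax(counts)]

# two well-separated toy "land cover" classes
X = np.array([[0., 0.], [0., 1.], [1., 0.], [5., 5.], [5., 6.], [6., 5.]])
y = np.array([0, 0, 0, 1, 1, 1])
assert knn_predict(X, y, np.array([0.2, 0.2])) == 0
assert knn_predict(X, y, np.array([5.2, 5.2])) == 1
```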
Collapse
Affiliation(s)
- Wenkai Guo
- China Three Gorges Corporation, Wuhan, 430010, China.
| | - Guoxing Yang
- China Three Gorges Corporation, Wuhan, 430010, China
| | - Guangchao Li
- College of Geoscience and Surveying Engineering, China University of Mining & Technology, Beijing, 100083, China
| | - Lin Ruan
- China Three Gorges Corporation, Wuhan, 430010, China
| | - Kun Liu
- China Three Gorges Corporation, Wuhan, 430010, China
| | - Qirong Li
- China Three Gorges Corporation, Wuhan, 430010, China
| |
Collapse
|
17
|
Jiang Y, Dong J, Zhang Y, Cheng T, Lin X, Liang J. PCF-Net: Position and context information fusion attention convolutional neural network for skin lesion segmentation. Heliyon 2023; 9:e13942. [PMID: 36923881 PMCID: PMC10009446 DOI: 10.1016/j.heliyon.2023.e13942] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2022] [Revised: 02/10/2023] [Accepted: 02/15/2023] [Indexed: 02/27/2023] Open
Abstract
Skin lesion segmentation is a crucial step in the process of skin cancer diagnosis and treatment. The variation in position, shape, size, and edges of skin lesion areas poses a challenge for accurate segmentation of skin lesion areas through dermoscopic images. To meet these challenges, in this paper, using UNet as the baseline model, a convolutional neural network based on position and context information fusion attention is proposed, called PCF-Net. A novel two-branch attention mechanism is designed to aggregate position and context information, called the Position and Context Information Aggregation Attention Module (PCFAM). A global context information complementary module (GCCM) was developed to obtain long-range dependencies. A multi-scale grouped dilated convolution feature extraction module (MSEM) was proposed to capture multi-scale feature information and is placed in the bottleneck of UNet. On the ISIC2018 dataset, extensive ablation experiments demonstrated the superiority of PCF-Net for dermoscopic image segmentation after adding PCFAM, GCCM, and MSEM. Compared with other state-of-the-art methods, PCF-Net achieves competitive results on all metrics.
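The dilated convolutions underlying the multi-scale module (MSEM) enlarge the receptive field without adding parameters: the kernel taps are simply spread `dilation` pixels apart. A minimal single-channel NumPy sketch of a 'valid' dilated convolution, illustrating the operator rather than PCF-Net's grouped variant:

```python
import numpy as np

def dilated_conv2d(x, kernel, dilation=1):
    """'Valid' 2-D cross-correlation with dilated kernel taps spaced
    `dilation` pixels apart; dilation=1 is an ordinary convolution."""
    kh, kw = kernel.shape
    eh, ew = (kh - 1) * dilation + 1, (kw - 1) * dilation + 1  # effective size
    H, W = x.shape
    oh, ow = H - eh + 1, W - ew + 1
    out = np.zeros((oh, ow))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * x[i * dilation:i * dilation + oh,
                                    j * dilation:j * dilation + ow]
    return out

x = np.arange(64, dtype=float).reshape(8, 8)
k = np.ones((3, 3))
assert dilated_conv2d(x, k, dilation=1).shape == (6, 6)
assert dilated_conv2d(x, k, dilation=2).shape == (4, 4)  # wider field, same 9 taps
```

A 3x3 kernel with dilation 2 covers a 5x5 window while still using only nine weights, which is exactly the parameter-free receptive-field growth the module exploits.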
Collapse
Affiliation(s)
- Yun Jiang
- College of Computer Science and Engineering, Northwest Normal University, Lanzhou, China
| | - Jinkun Dong
- College of Computer Science and Engineering, Northwest Normal University, Lanzhou, China
| | - Yuan Zhang
- College of Computer Science and Engineering, Northwest Normal University, Lanzhou, China
| | - Tongtong Cheng
- College of Computer Science and Engineering, Northwest Normal University, Lanzhou, China
| | - Xin Lin
- College of Computer Science and Engineering, Northwest Normal University, Lanzhou, China
| | - Jing Liang
- College of Computer Science and Engineering, Northwest Normal University, Lanzhou, China
| |
Collapse
|
18
|
Jiang Y, Dong J, Cheng T, Zhang Y, Lin X, Liang J. iU-Net: a hybrid structured network with a novel feature fusion approach for medical image segmentation. BioData Min 2023; 16:5. [PMID: 36805687 PMCID: PMC9942350 DOI: 10.1186/s13040-023-00320-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2022] [Accepted: 01/04/2023] [Indexed: 02/23/2023] Open
Abstract
In recent years, convolutional neural networks (CNNs) have made great achievements in the field of medical image segmentation, especially fully convolutional neural networks based on U-shaped structures and skip connections. However, limited by the inherent limitations of convolution, CNN-based methods usually exhibit limitations in modeling long-range dependencies and are unable to extract large amounts of global contextual information, which deprives neural networks of the ability to adapt to different visual modalities. In this paper, we propose our own model, called iU-Net because its structure closely resembles the combination of i and U. iU-Net is a multiple encoder-decoder structure combining Swin Transformer and CNN. We use a hierarchical Swin Transformer structure with shifted windows as the primary encoder and convolution as the secondary encoder to complement the context information extracted by the primary encoder. To sufficiently fuse the feature information extracted from multiple encoders, we design a feature fusion module (W-FFM) based on wave function representation. Besides, a three-branch upsampling method (Tri-Upsample) has been developed to replace the patch expand in the Swin Transformer, which can effectively avoid the checkerboard artifacts caused by the patch expand. On the skin lesion region segmentation task, the segmentation performance of iU-Net is optimal, with Dice and IoU reaching 90.12% and 83.06%, respectively. To verify the generalization of iU-Net, we used the model trained on the ISIC2018 dataset to test on the PH2 dataset, and achieved 93.80% Dice and 88.74% IoU. On the lung field segmentation task, iU-Net achieved optimal results on IoU and precision, reaching 98.54% and 94.35%, respectively. Extensive experiments demonstrate the segmentation performance and generalization ability of iU-Net.
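Checkerboard artifacts, which the paper's Tri-Upsample is designed to avoid, are the classic failure mode of transposed-convolution-style upsamplers with uneven kernel overlap. A standard alternative (shown below; this is the generic resize-then-convolve upsampler, not Tri-Upsample itself) sidesteps the problem by construction:

```python
import numpy as np

def resize_conv_upsample(x, kernel):
    """Nearest-neighbour 2x resize followed by a smoothing convolution with
    edge padding: every output pixel sees the same kernel overlap, so the
    uneven-overlap pattern behind checkerboard artifacts cannot arise."""
    up = x.repeat(2, axis=0).repeat(2, axis=1)
    H, W = up.shape
    kh, kw = kernel.shape
    pad = kh // 2
    padded = np.pad(up, pad, mode='edge')
    out = np.zeros_like(up)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + H, j:j + W]
    return out

flat = np.ones((4, 4))
out = resize_conv_upsample(flat, np.full((3, 3), 1.0 / 9.0))
assert out.shape == (8, 8)
assert np.allclose(out, 1.0)      # a flat input stays flat: no checkerboard
```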
Collapse
Affiliation(s)
- Yun Jiang
- College of Computer Science and Engineering, Northwest Normal University, Lanzhou, China
| | - Jinkun Dong
- College of Computer Science and Engineering, Northwest Normal University, Lanzhou, China.
| | - Tongtong Cheng
- College of Computer Science and Engineering, Northwest Normal University, Lanzhou, China
| | - Yuan Zhang
- College of Computer Science and Engineering, Northwest Normal University, Lanzhou, China
| | - Xin Lin
- College of Computer Science and Engineering, Northwest Normal University, Lanzhou, China
| | - Jing Liang
- College of Computer Science and Engineering, Northwest Normal University, Lanzhou, China
| |
Collapse
|
19
|
Zhang B, Wang Y, Ding C, Deng Z, Li L, Qin Z, Ding Z, Bian L, Yang C. Multi-scale feature pyramid fusion network for medical image segmentation. Int J Comput Assist Radiol Surg 2023; 18:353-365. [PMID: 36042149 DOI: 10.1007/s11548-022-02738-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2022] [Accepted: 08/11/2022] [Indexed: 02/03/2023]
Abstract
PURPOSE Medical image segmentation is the most widely used technique in diagnostic and clinical research, and accurate segmentation of target organs from blurred border regions and low-contrast adjacent organs in computed tomography (CT) imaging is crucial for clinical diagnosis and treatment. METHODS In this article, we propose a Multi-Scale Feature Pyramid Fusion Network (MS-Net) based on the codec structure formed by the combination of a Multi-Scale Attention Module (MSAM) and a Stacked Feature Pyramid Module (SFPM). The MSAM is used in the skip connections and aims to extract different levels of context details by dynamically adjusting the receptive fields under different network depths; the SFPM, which includes multi-scale strategies and a multi-layer Feature Perception Module (FPM), is nested in the network at the deepest point and aims to better focus the network's attention on the target organ by adaptively increasing the weight of the features of interest. RESULTS Experiments demonstrate that the proposed MS-Net significantly improved the Dice score from 91.74% to 94.54% on CHAOS, from 97.59% to 98.59% on Lung, and from 82.55% to 86.06% on ISIC 2018, compared with U-Net. Additionally, comparisons with six other state-of-the-art codec structures also show that the presented network has great advantages on evaluation indicators such as mIoU, Dice, ACC, and AUC. CONCLUSION The experimental results show that both the MSAM and SFPM techniques proposed in this paper can assist the network in improving the segmentation effect, so that the proposed MS-Net method achieves better results in the CHAOS, Lung, and ISIC 2018 segmentation tasks.
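The Dice score quoted throughout these abstracts has a compact definition: twice the intersection over the sum of the two mask sizes. A minimal reference implementation for binary masks (the epsilon smoothing term is a common convention, not taken from any of these papers):

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice similarity coefficient between two binary masks:
    2|A ∩ B| / (|A| + |B|), smoothed by eps for empty masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

a = np.zeros((4, 4)); a[0:2, 0:2] = 1        # 4 foreground pixels
b = np.zeros((4, 4)); b[0:2, 1:3] = 1        # same size, shifted one column
assert np.isclose(dice_score(a, a), 1.0)     # perfect overlap
assert np.isclose(dice_score(a, b), 0.5)     # half the pixels overlap
```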
Collapse
Affiliation(s)
- Bing Zhang
- Power Systems Engineering Research Center, Ministry of Education, College of Big Data and Information Engineering, Guizhou University, Guiyang, 550025, China
| | - Yang Wang
- Power Systems Engineering Research Center, Ministry of Education, College of Big Data and Information Engineering, Guizhou University, Guiyang, 550025, China
| | - Caifu Ding
- Power Systems Engineering Research Center, Ministry of Education, College of Big Data and Information Engineering, Guizhou University, Guiyang, 550025, China
| | - Ziqing Deng
- Power Systems Engineering Research Center, Ministry of Education, College of Big Data and Information Engineering, Guizhou University, Guiyang, 550025, China
| | - Linwei Li
- Power Systems Engineering Research Center, Ministry of Education, College of Big Data and Information Engineering, Guizhou University, Guiyang, 550025, China
| | - Zesheng Qin
- Power Systems Engineering Research Center, Ministry of Education, College of Big Data and Information Engineering, Guizhou University, Guiyang, 550025, China
| | - Zhao Ding
- Power Systems Engineering Research Center, Ministry of Education, College of Big Data and Information Engineering, Guizhou University, Guiyang, 550025, China
| | - Lifeng Bian
- Frontier Institute of Chip and System, Fudan University, Shanghai, 200433, China.
| | - Chen Yang
- Power Systems Engineering Research Center, Ministry of Education, College of Big Data and Information Engineering, Guizhou University, Guiyang, 550025, China.
| |
Collapse
|
20
|
Kushnure DT, Tyagi S, Talbar SN. LiM-Net: Lightweight multi-level multiscale network with deep residual learning for automatic liver segmentation in CT images. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104305] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
|
21
|
Zheng J, Liu H, Feng Y, Xu J, Zhao L. CASF-Net: Cross-attention and cross-scale fusion network for medical image segmentation. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023; 229:107307. [PMID: 36571889 DOI: 10.1016/j.cmpb.2022.107307] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/06/2022] [Revised: 11/22/2022] [Accepted: 12/09/2022] [Indexed: 06/18/2023]
Abstract
BACKGROUND Automatic segmentation of medical images has progressed greatly owing to the development of convolutional neural networks (CNNs). However, there are two uncertainties with current approaches based on convolutional operations: (1) how to eliminate the general limitation that CNNs lack the ability to model long-range dependencies and global contextual interactions, and (2) how to efficiently discover and integrate the global and local features implied in the image. Notably, these two problems are interconnected, yet previous approaches mainly focus on the first problem and ignore the importance of information integration. METHODS In this paper, we propose a novel cross-attention and cross-scale fusion network (CASF-Net), which aims to explicitly tap the potential of dual-branch networks and fully integrate coarse and fine-grained feature representations. Specifically, the well-designed dual-branch encoder is dedicated to modeling non-local dependencies and multi-scale contexts, significantly improving the quality of semantic segmentation. Moreover, the proposed cross-attention and cross-scale module efficiently performs multi-scale information fusion and is capable of further exploring long-range contextual information. RESULTS Extensive experiments conducted on three different types of medical image segmentation tasks demonstrate the state-of-the-art performance of our proposed method both visually and numerically. CONCLUSIONS This paper assembles the feature representation capabilities of CNNs and transformers and proposes cross-attention and cross-scale fusion algorithms. The promising results show new possibilities for using cross-fusion mechanisms in more downstream medical image tasks.
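Cross-attention between two branches, with queries from one and keys and values from the other, is the core exchange in this kind of dual-branch fusion. A minimal single-head NumPy sketch (scaled dot-product only; CASF-Net's full module adds learned projections and multi-scale handling):

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(xa, xb):
    """Single-head cross-attention: tokens of branch A query tokens of
    branch B, so each A token aggregates the B features most relevant
    to it. Each output row is a convex combination of B's tokens."""
    d = xa.shape[1]
    attn = softmax(xa @ xb.T / np.sqrt(d))   # (na, nb), each row sums to 1
    return attn @ xb                          # (na, d)

xa = np.random.rand(3, 4)                     # e.g. CNN-branch tokens
xb = np.random.rand(5, 4)                     # e.g. transformer-branch tokens
fused = cross_attention(xa, xb)
assert fused.shape == (3, 4)
```

Because the attention rows are convex weights, every fused feature stays inside the range spanned by branch B's tokens: the module routes information, it does not invent it.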
Collapse
Affiliation(s)
- Jianwei Zheng
- College of Computer Science and Engineering, Zhejiang University of Technology, Hangzhou 310014, China.
| | - Hao Liu
- College of Computer Science and Engineering, Zhejiang University of Technology, Hangzhou 310014, China
| | - Yuchao Feng
- College of Computer Science and Engineering, Zhejiang University of Technology, Hangzhou 310014, China
| | - Jinshan Xu
- College of Computer Science and Engineering, Zhejiang University of Technology, Hangzhou 310014, China
| | - Liang Zhao
- Stomatological Hospital of Xiamen Medical College and the Xiamen Key Laboratory of Stomatological Disease Diagnosis and Treatment, Xiamen 361000, China.
| |
Collapse
|
22
|
Liu X, Shen H, Gao L, Guo R. Lung parenchyma segmentation based on semantic data augmentation and boundary attention consistency. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
|
23
|
Philippi D, Rothaus K, Castelli M. A vision transformer architecture for the automated segmentation of retinal lesions in spectral domain optical coherence tomography images. Sci Rep 2023; 13:517. [PMID: 36627357 PMCID: PMC9832034 DOI: 10.1038/s41598-023-27616-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/11/2022] [Accepted: 01/04/2023] [Indexed: 01/12/2023] Open
Abstract
Neovascular age-related macular degeneration (nAMD) is one of the major causes of irreversible blindness and is characterized by accumulations of different lesions inside the retina. AMD biomarkers enable experts to grade AMD and could be used for therapy prognosis and individualized treatment decisions. In particular, intra-retinal fluid (IRF), sub-retinal fluid (SRF), and pigment epithelium detachment (PED) are prominent biomarkers for grading neovascular AMD. Spectral-domain optical coherence tomography (SD-OCT) revolutionized nAMD early diagnosis by providing cross-sectional images of the retina. Automatic segmentation and quantification of IRF, SRF, and PED in SD-OCT images can be extremely useful for clinical decision-making. Despite the excellent performance of convolutional neural network (CNN)-based methods, the task still presents some challenges due to relevant variations in the location, size, shape, and texture of the lesions. This work adopts a transformer-based method to automatically segment retinal lesions from SD-OCT images and qualitatively and quantitatively evaluates its performance against CNN-based methods. The method combines the efficient long-range feature extraction and aggregation capabilities of Vision Transformers with data-efficient training of CNNs. The proposed method was tested on a private dataset containing 3842 2-dimensional SD-OCT retina images, manually labeled by experts of the Franziskus Eye-Center, Muenster. While one of the competitors presents a better performance in terms of Dice score, the proposed method is significantly less computationally expensive. Thus, future research will focus on the proposed network's architecture to increase its segmentation performance while maintaining its computational efficiency.
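The tokenisation at the front of any Vision Transformer splits the image into fixed-size patches; in NumPy the whole step is one reshape. A generic sketch of that step, not the paper's exact embedding pipeline:

```python
import numpy as np

def patchify(img, p):
    """Split an (H, W) image into flattened, non-overlapping p x p patches,
    the tokenisation step at the front of a Vision Transformer."""
    H, W = img.shape
    # (H//p, p, W//p, p) -> (H//p, W//p, p, p): group rows/cols into patches
    patches = img.reshape(H // p, p, W // p, p).swapaxes(1, 2)
    return patches.reshape(-1, p * p)          # (num_patches, p*p) tokens

img = np.arange(16, dtype=float).reshape(4, 4)
tokens = patchify(img, 2)
assert tokens.shape == (4, 4)                  # four 2x2 patches, 4 values each
assert np.array_equal(tokens[0], img[:2, :2].ravel())
```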
Collapse
Affiliation(s)
- Daniel Philippi
- NOVA Information Management School (NOVA IMS), Universidade Nova de Lisboa, 1070-312 Lisbon, Portugal
| | - Kai Rothaus
- Department of Ophthalmology, St. Franziskus Hospital, 48145 Muenster, Germany
| | - Mauro Castelli
- NOVA Information Management School (NOVA IMS), Universidade Nova de Lisboa, 1070-312 Lisbon, Portugal; School of Economics and Business, University of Ljubljana, Ljubljana, Slovenia
| |
Collapse
|
24
|
Wang X, Wang S, Zhang Z, Yin X, Wang T, Li N. CPAD-Net: Contextual parallel attention and dilated network for liver tumor segmentation. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
|
25
|
Huang X, Chen J, Chen M, Chen L, Wan Y. TDD-UNet:Transformer with double decoder UNet for COVID-19 lesions segmentation. Comput Biol Med 2022; 151:106306. [PMID: 36403357 PMCID: PMC9664702 DOI: 10.1016/j.compbiomed.2022.106306] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2022] [Revised: 10/22/2022] [Accepted: 11/06/2022] [Indexed: 11/09/2022]
Abstract
The outbreak of novel coronavirus pneumonia (COVID-19) has brought severe health risks to the world. Detection of COVID-19 based on the UNet network has attracted widespread attention in medical image segmentation. However, the traditional UNet model struggles to capture long-range dependencies in the image due to the limitations of a convolution kernel with a fixed receptive field. The Transformer encoder overcomes the long-range dependence problem, but Transformer-based segmentation approaches cannot effectively capture fine-grained details. To address this challenge, we propose TDD-UNet, a Transformer with a double-decoder UNet for COVID-19 lesion segmentation. We introduce the multi-head self-attention of the Transformer into the UNet encoding layer to extract global context information. The dual-decoder structure improves the result of foreground segmentation by predicting the background and applying deep supervision. We performed quantitative analysis and comparison of our proposed method on four public datasets with different modalities, including CT and CXR, to demonstrate its effectiveness and generality in segmenting COVID-19 lesions. We also performed ablation studies on the COVID-19-CT-505 dataset to verify the effectiveness of the key components of our proposed model. The proposed TDD-UNet also achieves higher Dice and Jaccard mean scores and the lowest standard deviation compared to competitors. Our proposed method achieves better segmentation results than other state-of-the-art methods.
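The dual-decoder supervision described above, with one decoder predicting the lesion foreground and the other its complement, can be summarised as a combined loss. A toy sketch with binary cross-entropy; the exact loss form and weighting in TDD-UNet are not specified in this abstract, so the equal-weight sum is an assumption:

```python
import numpy as np

def bce(p, y, eps=1e-7):
    """Binary cross-entropy between predictions p and targets y in [0, 1]."""
    p = np.clip(p, eps, 1 - eps)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p)).mean()

def dual_decoder_loss(fg_pred, bg_pred, mask):
    """Supervise two decoders jointly: one on the lesion foreground mask,
    the other on its complement (the background)."""
    return bce(fg_pred, mask) + bce(bg_pred, 1 - mask)

mask = np.array([[0., 1.], [1., 0.]])
assert dual_decoder_loss(mask, 1 - mask, mask) < 1e-5   # perfect decoders
```

Supervising the background branch gives the network a second, complementary error signal over exactly the pixels the foreground branch finds hardest.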
Affiliation(s)
- Xuping Huang
- Computer School, University of South China, Hengyang 421001, China
- Junxi Chen
- Affiliated Nanhua Hospital, University of South China, Hengyang 421001, China
- Mingzhi Chen
- College of Mechanical and Vehicle Engineering, Hunan University, Changsha 410082, China
- Lingna Chen
- Computer School, University of South China, Hengyang 421001, China (corresponding author)
- Yaping Wan
- Computer School, University of South China, Hengyang 421001, China (corresponding author)
26
Chen Y, Zheng C, Hu F, Zhou T, Feng L, Xu G, Yi Z, Zhang X. Efficient two-step liver and tumour segmentation on abdominal CT via deep learning and a conditional random field. Comput Biol Med 2022; 150:106076. [PMID: 36137320 DOI: 10.1016/j.compbiomed.2022.106076] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2022] [Revised: 08/20/2022] [Accepted: 09/03/2022] [Indexed: 11/24/2022]
Abstract
Segmentation of the liver and tumours from computed tomography (CT) scans is an important task in hepatic surgical planning. Manual segmentation of the liver and tumours is a time-consuming and labour-intensive task; therefore, a fully automated method for performing this segmentation is particularly desired. An automatic two-step liver and tumour segmentation method is presented in this paper. A cascade framework is used during the segmentation process, and a fully connected conditional random field (CRF) method is used to refine the tumour segmentation result. First, the proposed fractal residual U-Net (FRA-UNet) is used to locate and initially segment the liver. Then, FRA-UNet is further used to predict liver tumours from the liver region of interest (ROI). Finally, a three-dimensional (3D) CRF is used to refine the tumour segmentation results. The improved fractal residual (FR) structure effectively retains more effective features for improving the segmentation performance of deeper networks, the improved deep residual block can utilise the feature information more effectively, and the 3D CRF method smooths the contours and avoids the tumour oversegmentation problem. FRA-UNet is tested on the Liver Tumour Segmentation challenge dataset (LiTS) and the 3D Image Reconstruction for Comparison of Algorithm Database dataset (3DIRCADb), achieving 97.13% and 97.18% Dice similarity coefficients (DSCs) for liver segmentation and 71.78% and 68.97% DSCs for tumour segmentation, respectively, outperforming most state-of-the-art networks.
Affiliation(s)
- Ying Chen
- School of Software, Nanchang Hangkong University, Nanchang, 330063, China
- Cheng Zheng
- School of Software, Nanchang Hangkong University, Nanchang, 330063, China
- Fei Hu
- School of Software, Nanchang Hangkong University, Nanchang, 330063, China
- Taohui Zhou
- School of Software, Nanchang Hangkong University, Nanchang, 330063, China
- Longfeng Feng
- School of Software, Nanchang Hangkong University, Nanchang, 330063, China
- Guohui Xu
- Department of Liver Neoplasms, Jiangxi Cancer Hospital, Nanchang, 330029, China
- Zhen Yi
- Department of Radiology, Jiangxi Cancer Hospital, Nanchang, 330029, China
- Xiang Zhang
- Wenzhou Data Management and Development Group Co., Ltd., Wenzhou, Zhejiang, 325000, China
27
Jiang S, Li J. TransCUNet: UNet cross fused transformer for medical image segmentation. Comput Biol Med 2022; 150:106207. [PMID: 37859294 DOI: 10.1016/j.compbiomed.2022.106207] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2022] [Revised: 09/20/2022] [Accepted: 10/09/2022] [Indexed: 11/21/2022]
Abstract
Accurate segmentation of medical images is crucial for clinical diagnosis and evaluation. However, medical images have complex shapes, the structures of different objects vary greatly, and most medical datasets are small in scale, making effective training difficult. These problems increase the difficulty of automatic segmentation. To further improve segmentation performance, we propose a multi-branch network model, called TransCUNet, for segmenting medical images of different modalities. The model contains three structures: a cross residual fusion block (CRFB), a pyramid pooling module (PPM) and gated axial attention, which together extract high-level and low-level image features effectively while remaining robust to segmentation objects of different sizes and datasets of different scales. In our experiments, we use four datasets to train, validate and test the models. The experimental results show that TransCUNet has better segmentation performance than current mainstream segmentation methods, with a smaller model size and fewer parameters, giving it great potential for clinical applications.
Affiliation(s)
- Shen Jiang
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai 264005, China
- Jinjiang Li
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai 264005, China
28
Wang J, Tian S, Yu L, Wang Y, Wang F, Zhou Z. SBDF-Net: A versatile dual-branch fusion network for medical image segmentation. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
29
Jiang S, Li J, Hua Z. Transformer with progressive sampling for medical cellular image segmentation. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2022; 19:12104-12126. [PMID: 36653988 DOI: 10.3934/mbe.2022563] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/17/2023]
Abstract
The convolutional neural network, as the backbone for medical image segmentation, has shown good performance in past years. However, its drawbacks cannot be ignored: convolutional neural networks focus on local regions and have difficulty modelling global contextual information. For this reason, the Transformer, originally used for text processing, was introduced into the field of medical segmentation, and thanks to its strength in modelling global relationships, segmentation accuracy was further improved. However, Transformer-based network structures require a certain training-set size to achieve satisfactory segmentation results, and most medical segmentation datasets are small. Therefore, in this paper we introduce a gated position-sensitive axial attention mechanism in the self-attention module, so that the Transformer-based network structure can also be adapted to small datasets. When handling segmentation tasks, vision Transformers commonly divide the input image into equal patches of the same size and then process each patch. This simple division, however, may destroy the structure of the original image, and the resulting grid may contain large unimportant regions that draw attention to uninteresting areas and degrade segmentation performance. Therefore, we add iterative sampling to update the sampling positions, so that attention stays on the region to be segmented, reducing the interference of irrelevant regions and further improving segmentation performance. In addition, we introduce a strip convolution module (SCM) and a pyramid pooling module (PPM) to capture global contextual information. The proposed network is evaluated on several datasets and shows improved segmentation accuracy compared to networks from recent years.
Affiliation(s)
- Shen Jiang
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai 264005, China
- Jinjiang Li
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai 264005, China
- Zhen Hua
- School of Information and Electronic Engineering, Shandong Technology and Business University, Yantai 264005, China
30
Zhao Q, Jia Q, Chi T. Deep learning as a novel method for endoscopic diagnosis of chronic atrophic gastritis: a prospective nested case-control study. BMC Gastroenterol 2022; 22:352. [PMID: 35879649 PMCID: PMC9310473 DOI: 10.1186/s12876-022-02427-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/15/2022] [Accepted: 07/15/2022] [Indexed: 11/28/2022] Open
Abstract
Background and aims Chronic atrophic gastritis (CAG) is a precancerous disease that often leads to the development of gastric cancer (GC) and is positively correlated with GC morbidity. However, the sensitivity of the endoscopic diagnosis of CAG is only 42%. Therefore, we developed a real-time video monitoring model for endoscopic diagnosis of CAG based on U-Net deep learning (DL) and conducted a prospective nested case–control study to evaluate the diagnostic evaluation indices of the model and its consistency with pathological diagnosis.
Methods Our cohort consisted of 1539 patients undergoing gastroscopy from December 1, 2020, to July 1, 2021. Based on pathological diagnosis, patients in the cohort were divided into the CAG group or the chronic nonatrophic gastritis (CNAG) group, and we assessed the diagnostic evaluation indices of this model and its consistency with pathological diagnosis after propensity score matching (PSM) to minimize selection bias in the study. Results After matching, the diagnostic evaluation indices and consistency evaluation of the model were better than those of endoscopists [sensitivity (84.02% vs. 62.72%), specificity (97.04% vs. 81.95%), positive predictive value (96.60% vs. 77.66%), negative predictive value (85.86% vs. 68.73%), accuracy rate (90.53% vs. 72.34%), Youden index (81.06% vs. 44.67%), odds product (172.5 vs. 7.64), positive likelihood ratio (28.39 vs. 3.47), negative likelihood ratio (0.16 vs. 0.45), AUC (95% CI) [0.909 (0.884–0.934) vs. 0.740 (0.702–0.778)] and Kappa (0.852 vs. 0.558)]. Conclusions Our prospective nested case–control study proved that the diagnostic evaluation indices and consistency evaluation of the real-time video monitoring model for endoscopic diagnosis of CAG based on U-Net DL were superior to those of endoscopists. Trial registration: ChiCTR2100044458, 18/03/2020.
Affiliation(s)
- Quchuan Zhao
- Department of Gastroenterology, Xuanwu Hospital of Capital Medical University, 45 Chang-chun Street, Beijing, 100053, China
- Qing Jia
- Department of Anesthesiology, Guang'anmen Hospital, China Academy of Chinese Medical Sciences, 5 North Court Street, Beijing, 100053, China
- Tianyu Chi
- Department of Gastroenterology, Xuanwu Hospital of Capital Medical University, 45 Chang-chun Street, Beijing, 100053, China
31
ECAU-Net: Efficient channel attention U-Net for fetal ultrasound cerebellum segmentation. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103528] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
32
Yang R, Yu J, Yin J, Liu K, Xu S. An FA-SegNet Image Segmentation Model Based on Fuzzy Attention and Its Application in Cardiac MRI Segmentation. INT J COMPUT INT SYS 2022. [DOI: 10.1007/s44196-022-00080-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022] Open
Abstract
Aiming at medical image segmentation with low target recognizability and high background noise, a deep convolutional neural network segmentation model based on a fuzzy attention mechanism, called FA-SegNet, is proposed. It takes SegNet as its basic framework. In the down-sampling module for image feature extraction, a fuzzy channel-attention module is added to strengthen the discrimination of different target regions. In the up-sampling module for image size restoration and multi-scale feature fusion, a fuzzy spatial-attention module is added to reduce the loss of image details and expand the receptive field. In this paper, fuzzy cognition is introduced into the feature fusion of CNNs. Based on the attention mechanism, fuzzy membership is used to re-calibrate the importance of pixel values in local regions. This strengthens the distinguishing ability of image features and the fusion of contextual information, which improves the segmentation accuracy of the target regions. Taking cardiac MRI segmentation as an experimental example, multiple targets such as the left ventricle, right ventricle, and left ventricular myocardium are selected as segmentation targets. The pixel accuracy is 92.47%, the mean intersection over union is 86.18%, and the Dice coefficient is 92.44%, all improved compared with other methods. This verifies the accuracy and applicability of the proposed method for medical image segmentation, especially for targets with low recognizability and serious occlusion.
33
Zhang C, Lu J, Hua Q, Li C, Wang P. SAA-Net: U-shaped network with Scale-Axis-Attention for liver tumor segmentation. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2021.103460] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/24/2023]
34
35
RMS-UNet: Residual multi-scale UNet for liver and lesion segmentation. Artif Intell Med 2022; 124:102231. [DOI: 10.1016/j.artmed.2021.102231] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/17/2020] [Revised: 11/12/2021] [Accepted: 12/17/2021] [Indexed: 12/12/2022]
36
Ahn JC, Qureshi TA, Singal AG, Li D, Yang JD. Deep learning in hepatocellular carcinoma: Current status and future perspectives. World J Hepatol 2021; 13:2039-2051. [PMID: 35070007 PMCID: PMC8727204 DOI: 10.4254/wjh.v13.i12.2039] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/28/2021] [Revised: 07/19/2021] [Accepted: 11/15/2021] [Indexed: 02/06/2023] Open
Abstract
Hepatocellular carcinoma (HCC) is among the leading causes of cancer incidence and death. Despite decades of research and development of new treatment options, the overall outcomes of patients with HCC continue to remain poor. There are areas of unmet need in risk prediction, early diagnosis, accurate prognostication, and individualized treatments for patients with HCC. Recent years have seen an explosive growth in the application of artificial intelligence (AI) technology in medical research, with the field of HCC being no exception. Among the various AI-based machine learning algorithms, deep learning algorithms are considered state-of-the-art techniques for handling and processing complex multimodal data ranging from routine clinical variables to high-resolution medical images. This article will provide a comprehensive review of the recently published studies that have applied deep learning for risk prediction, diagnosis, prognostication, and treatment planning for patients with HCC.
Affiliation(s)
- Joseph C Ahn
- Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN 55904, United States
- Touseef Ahmad Qureshi
- Biomedical Imaging Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States
- Amit G Singal
- Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, United States
- Debiao Li
- Biomedical Imaging Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States
- Ju-Dong Yang
- Karsh Division of Gastroenterology and Hepatology, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States
37
Chi J, Han X, Wu C, Wang H, Ji P. X-Net: Multi-branch UNet-like network for liver and tumor segmentation from 3D abdominal CT scans. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.06.021] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022]
38
Rosas-Gonzalez S, Birgui-Sekou T, Hidane M, Zemmoura I, Tauber C. Asymmetric Ensemble of Asymmetric U-Net Models for Brain Tumor Segmentation With Uncertainty Estimation. Front Neurol 2021; 12:609646. [PMID: 34659077 PMCID: PMC8515181 DOI: 10.3389/fneur.2021.609646] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2020] [Accepted: 07/22/2021] [Indexed: 11/29/2022] Open
Abstract
Accurate brain tumor segmentation is crucial for clinical assessment, follow-up, and subsequent treatment of gliomas. While convolutional neural networks (CNN) have become state of the art in this task, most proposed models either use 2D architectures, ignoring 3D contextual information, or 3D models that require large memory capacity and extensive learning databases. In this study, an ensemble of two kinds of U-Net-like models based on both 3D and 2.5D convolutions is proposed to segment multimodal magnetic resonance images (MRI). The 3D model uses concatenated data in a modified U-Net architecture. In contrast, the 2.5D model is based on a multi-input strategy to extract low-level features from each modality independently and on a new 2.5D Multi-View Inception block that merges features from different views of a 3D image, aggregating multi-scale features. The Asymmetric Ensemble of Asymmetric U-Net (AE AU-Net) based on both is designed to find a balance between increasing multi-scale and 3D contextual information extraction and keeping memory consumption low. Experiments on the BraTS 2019 dataset show that our model improves enhancing-tumor sub-region segmentation. Overall, performance is comparable with state-of-the-art results, despite requiring less learning data or memory. In addition, we provide voxel-wise and structure-wise uncertainties of the segmentation results, and we have established qualitative and quantitative relationships between uncertainty and prediction errors. Dice similarity coefficients for the whole tumor, tumor core, and enhancing tumor regions on the BraTS 2019 validation dataset were 0.902, 0.815, and 0.773. We also applied our method to BraTS 2018, with corresponding Dice scores of 0.908, 0.838, and 0.800.
Affiliation(s)
- Moncef Hidane
- LIFAT EA 6300, INSA Centre Val de Loire, Université de Tours, Tours, France
- Ilyess Zemmoura
- UMR Inserm U1253, iBrain, Université de Tours, Inserm, Tours, France
- Clovis Tauber
- UMR Inserm U1253, iBrain, Université de Tours, Inserm, Tours, France
39
Zhang C, Lu J, Yang L, Li C. CAAGP: Rethinking channel attention with adaptive global pooling for liver tumor segmentation. Comput Biol Med 2021; 138:104875. [PMID: 34563854 DOI: 10.1016/j.compbiomed.2021.104875] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2021] [Revised: 09/14/2021] [Accepted: 09/14/2021] [Indexed: 11/16/2022]
Abstract
Channel attention, a channel-wise method often used in computer vision tasks, including liver tumor segmentation, models the relationships among channels to augment the representation ability of feature maps. Channel attention adaptively generates channel-wise responses using global pooling, which aggregates spatial information only roughly. In fact, global pooling may lose fine information that is vital for segmentation tasks. Hence, we rethink the problem and propose channel attention with adaptive global pooling (CAAGP), which preserves spatial and fine-grained information for liver tumor segmentation when channel attention is generated. The model consists of three main parts: an improved self-attention module, an adaptive global pooling module and a response generation module. Self-attention achieves excellent performance in computing spatial attention but introduces serious computation and memory burdens. To remedy these burdens, we improve self-attention by aggregating spatial information from the x and y directions separately. Extensive experiments have been conducted to verify the effectiveness of our proposed method. Our CAAGP significantly outperforms other attention mechanisms in liver tumor segmentation, especially for small tumors.
Affiliation(s)
- Chi Zhang
- School of Information Science and Engineering, Southeast University, China
- Jingben Lu
- School of Information Science and Engineering, Southeast University, China
- Luxi Yang
- School of Information Science and Engineering, Southeast University, China
- Chunguo Li
- School of Information Science and Engineering, Southeast University, China
40
Abstract
Artificial intelligence is poised to revolutionize medical imaging. It takes advantage of the high-dimensional quantitative features present in medical images that may not be fully appreciated by humans. Artificial intelligence has the potential to facilitate automatic organ segmentation, disease detection and characterization, and prediction of disease recurrence. This article reviews the current status of artificial intelligence in liver imaging and the opportunities and challenges in clinical implementation.
41
Huang D, Bai H, Wang L, Hou Y, Li L, Xia Y, Yan Z, Chen W, Chang L, Li W. The Application and Development of Deep Learning in Radiotherapy: A Systematic Review. Technol Cancer Res Treat 2021; 20:15330338211016386. [PMID: 34142614 PMCID: PMC8216350 DOI: 10.1177/15330338211016386] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/27/2023] Open
Abstract
With the massive use of computers, the growth and explosion of data has greatly promoted the development of artificial intelligence (AI). The rise of deep learning (DL) algorithms, such as convolutional neural networks (CNN), has provided radiation oncologists with many promising tools that can simplify the complex radiotherapy process in the clinical work of radiation oncology, improve the accuracy and objectivity of diagnosis, and reduce the workload, thus enabling clinicians to spend more time on advanced decision-making tasks. As the development of DL gets closer to clinical practice, radiation oncologists will need to be more familiar with its principles to properly evaluate and use this powerful tool. In this paper, we explain the development and basic concepts of AI and discuss its application in radiation oncology based on different task categories of DL algorithms. This work clarifies the possibility of further development of DL in radiation oncology.
Affiliation(s)
- Danju Huang
- Department of Radiation Oncology, The Third Affiliated Hospital of Kunming Medical University, Yunnan Cancer Hospital, Kunming, Yunnan, China
- Han Bai
- Department of Radiation Oncology, The Third Affiliated Hospital of Kunming Medical University, Yunnan Cancer Hospital, Kunming, Yunnan, China
- Li Wang
- Department of Radiation Oncology, The Third Affiliated Hospital of Kunming Medical University, Yunnan Cancer Hospital, Kunming, Yunnan, China
- Yu Hou
- Department of Radiation Oncology, The Third Affiliated Hospital of Kunming Medical University, Yunnan Cancer Hospital, Kunming, Yunnan, China
- Lan Li
- Department of Radiation Oncology, The Third Affiliated Hospital of Kunming Medical University, Yunnan Cancer Hospital, Kunming, Yunnan, China
- Yaoxiong Xia
- Department of Radiation Oncology, The Third Affiliated Hospital of Kunming Medical University, Yunnan Cancer Hospital, Kunming, Yunnan, China
- Zhirui Yan
- Department of Radiation Oncology, The Third Affiliated Hospital of Kunming Medical University, Yunnan Cancer Hospital, Kunming, Yunnan, China
- Wenrui Chen
- Department of Radiation Oncology, The Third Affiliated Hospital of Kunming Medical University, Yunnan Cancer Hospital, Kunming, Yunnan, China
- Li Chang
- Department of Radiation Oncology, The Third Affiliated Hospital of Kunming Medical University, Yunnan Cancer Hospital, Kunming, Yunnan, China
- Wenhui Li
- Department of Radiation Oncology, The Third Affiliated Hospital of Kunming Medical University, Yunnan Cancer Hospital, Kunming, Yunnan, China
42
MS-UNet: A multi-scale UNet with feature recalibration approach for automatic liver and tumor segmentation in CT images. Comput Med Imaging Graph 2021; 89:101885. [DOI: 10.1016/j.compmedimag.2021.101885] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2020] [Revised: 01/22/2021] [Accepted: 01/24/2021] [Indexed: 01/22/2023]
43
Tong X, Wei J, Sun B, Su S, Zuo Z, Wu P. ASCU-Net: Attention Gate, Spatial and Channel Attention U-Net for Skin Lesion Segmentation. Diagnostics (Basel) 2021; 11:501. [PMID: 33809048 PMCID: PMC7999819 DOI: 10.3390/diagnostics11030501] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2020] [Revised: 03/08/2021] [Accepted: 03/09/2021] [Indexed: 01/29/2023] Open
Abstract
Segmentation of skin lesions is a challenging task because of the wide range of skin lesion shapes, sizes, colors, and texture types. In the past few years, deep learning networks such as U-Net have been successfully applied to medical image segmentation and exhibited faster and more accurate performance. In this paper, we propose an extended version of U-Net for the segmentation of skin lesions using a triple attention mechanism. We first selected regions using attention coefficients computed by the attention gate and contextual information. Second, a dual attention decoding module consisting of spatial attention and channel attention was used to capture the spatial correlation between features and improve segmentation performance. The combination of the three attention mechanisms helped the network focus on a more relevant field of view of the target. The proposed model was evaluated on three datasets, ISIC-2016, ISIC-2017, and PH2. The experimental results demonstrated the effectiveness of our method, with strong robustness to irregular borders, smooth transitions between lesion and skin, noise, and artifacts.
Affiliation(s)
- Junyu Wei
- College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China
44
Tran ST, Cheng CH, Nguyen TT, Le MH, Liu DG. TMD-Unet: Triple-Unet with Multi-Scale Input Features and Dense Skip Connection for Medical Image Segmentation. Healthcare (Basel) 2021; 9:54. [PMID: 33419018 PMCID: PMC7825313 DOI: 10.3390/healthcare9010054] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2020] [Revised: 12/29/2020] [Accepted: 01/02/2021] [Indexed: 11/18/2022] Open
Abstract
Deep learning is one of the most effective approaches to medical image processing applications, and network models are being studied more and more for medical image segmentation challenges. The encoder-decoder structure has achieved great success, in particular the Unet architecture, which serves as a baseline for medical image segmentation networks. Traditional Unet and Unet-based networks still have a limitation: they cannot fully exploit the output features of the convolutional units in each node. In this study, we proposed a new network model named TMD-Unet, which has three main enhancements over Unet: (1) modifying the interconnection of the network nodes, (2) using dilated convolution instead of standard convolution, and (3) integrating multi-scale input features on the input side of the model and applying a dense skip connection instead of a regular skip connection. Our experiments were performed on seven datasets covering many different medical image modalities, including colonoscopy, electron microscopy (EM), dermoscopy, computed tomography (CT), and magnetic resonance imaging (MRI). The segmentation applications implemented in the paper include EM, nuclei, polyp, skin lesion, left atrium, spleen, and liver segmentation. The Dice scores of our proposed model were 96.43% for liver segmentation, 95.51% for spleen, 92.65% for polyp, 94.11% for EM, 92.49% for nuclei, 91.81% for left atrium, and 87.27% for skin lesion segmentation. The experimental results showed that the proposed model was superior to popular models in all seven applications, demonstrating its high generality.
Affiliation(s)
- Song-Toan Tran
- Program of Electrical and Communications Engineering, Feng Chia University, Taichung 40724, Taiwan
- Department of Electrical and Electronics, Tra Vinh University, Tra Vinh 87000, Vietnam
- Ching-Hwa Cheng
- Department of Electronic Engineering, Feng Chia University, Taichung 40724, Taiwan
- Thanh-Tuan Nguyen
- Program of Electrical and Communications Engineering, Feng Chia University, Taichung 40724, Taiwan
- Minh-Hai Le
- Program of Electrical and Communications Engineering, Feng Chia University, Taichung 40724, Taiwan
- Department of Electrical and Electronics, Tra Vinh University, Tra Vinh 87000, Vietnam
- Don-Gey Liu
- Program of Electrical and Communications Engineering, Feng Chia University, Taichung 40724, Taiwan
- Department of Electronic Engineering, Feng Chia University, Taichung 40724, Taiwan
45
Yao C, Tang J, Hu M, Wu Y, Guo W, Li Q, Zhang XP. Claw U-Net: A UNet Variant Network with Deep Feature Concatenation for Scleral Blood Vessel Segmentation. ARTIF INTELL 2021. [DOI: 10.1007/978-3-030-93049-3_6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
46
Debs N, Cho TH, Rousseau D, Berthezène Y, Buisson M, Eker O, Mechtouff L, Nighoghossian N, Ovize M, Frindel C. Impact of the reperfusion status for predicting the final stroke infarct using deep learning. NEUROIMAGE-CLINICAL 2020; 29:102548. [PMID: 33450521 PMCID: PMC7810765 DOI: 10.1016/j.nicl.2020.102548] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/03/2020] [Revised: 12/15/2020] [Accepted: 12/20/2020] [Indexed: 01/10/2023]
Abstract
BACKGROUND Predictive maps of the final infarct may help therapeutic decisions in acute ischemic stroke patients. Our objectives were to assess whether integrating the reperfusion status into deep learning models would improve their performance, and to compare them to current clinical prediction methods. METHODS We trained and tested convolutional neural networks (CNNs) to predict the final infarct in acute ischemic stroke patients treated by thrombectomy in our center. When training the CNNs, non-reperfused patients from a non-thrombectomized cohort were added to the training set to increase the size of this group. Baseline diffusion- and perfusion-weighted magnetic resonance imaging (MRI) were used as inputs, and the lesion segmented on day-6 MRI served as the ground truth for the final infarct. The cohort was dichotomized into two subsets, reperfused and non-reperfused patients, from which reperfusion-status-specific CNNs were developed and compared to one another, and to the clinically used perfusion-diffusion mismatch model. Evaluation metrics included the Dice similarity coefficient (DSC), precision, recall, volumetric similarity, Hausdorff distance, and area under the curve (AUC). RESULTS We analyzed 109 patients, including 35 without reperfusion. The highest DSCs were achieved in both reperfused and non-reperfused patients (DSC = 0.44 ± 0.25 and 0.47 ± 0.17, respectively) when using the corresponding reperfusion-status-specific CNN. CNN-based models achieved higher DSC and AUC values than the perfusion-diffusion mismatch models (reperfused patients: AUC = 0.87 ± 0.13 vs 0.79 ± 0.17, P < 0.001; non-reperfused patients: AUC = 0.81 ± 0.13 vs 0.73 ± 0.14, P < 0.01, for CNN vs perfusion-diffusion mismatch models, respectively). CONCLUSION The performance of deep learning models improved when the reperfusion status was incorporated in their training. CNN-based models outperformed the clinically used perfusion-diffusion mismatch model. Comparing the predicted infarct in case of successful vs failed reperfusion may help in estimating the treatment effect and guiding therapeutic decisions in selected patients.
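As a minimal illustration of two of the overlap metrics reported in this abstract, the Dice similarity coefficient and volumetric similarity for binary lesion masks can be sketched as follows. This is a NumPy-based sketch under standard metric definitions; the function names and the toy masks are illustrative, not taken from the paper's code.

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice similarity coefficient (DSC) between two binary masks."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: perfect agreement by convention
    return 2.0 * intersection / denom

def volumetric_similarity(pred: np.ndarray, truth: np.ndarray) -> float:
    """Volumetric similarity: 1 - |V_pred - V_truth| / (V_pred + V_truth)."""
    vp = int(pred.astype(bool).sum())
    vt = int(truth.astype(bool).sum())
    if vp + vt == 0:
        return 1.0
    return 1.0 - abs(vp - vt) / (vp + vt)

# Toy 2D "slices" standing in for 3D MRI lesion masks
pred = np.zeros((4, 4), dtype=np.uint8)
truth = np.zeros((4, 4), dtype=np.uint8)
pred[1:3, 1:3] = 1    # 4 predicted voxels
truth[1:3, 1:4] = 1   # 6 ground-truth voxels, 4 of them overlapping
print(dice_coefficient(pred, truth))       # 2*4 / (4+6) = 0.8
print(volumetric_similarity(pred, truth))  # 1 - |4-6|/10 = 0.8
```

Note that volumetric similarity compares only lesion volumes, so two masks with no spatial overlap can still score highly; the paper accordingly reports it alongside overlap (DSC) and boundary (Hausdorff) metrics.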
Affiliation(s)
- Noëlie Debs
- CREATIS, CNRS, UMR-5220, INSERM U1206, Université Lyon 1, INSA Lyon, Villeurbanne, France.
- Tae-Hee Cho
- CREATIS, CNRS, UMR-5220, INSERM U1206, Université Lyon 1, INSA Lyon, Villeurbanne, France; Department of Vascular Neurology, Hospices Civils de Lyon, Lyon, France.
- David Rousseau
- LARIS, UMR IRHS INRA, Université d'Angers, Angers, France.
- Yves Berthezène
- CREATIS, CNRS, UMR-5220, INSERM U1206, Université Lyon 1, INSA Lyon, Villeurbanne, France; Department of Neuroradiology, Hospices Civils de Lyon, Lyon, France.
- Marielle Buisson
- Department of Cardiology, Clinical Investigation Center, CarMeN INSERM U1060, INRA U1397, INSA Lyon, Université Lyon 1, Hospices Civils de Lyon, Lyon, France.
- Omer Eker
- CREATIS, CNRS, UMR-5220, INSERM U1206, Université Lyon 1, INSA Lyon, Villeurbanne, France; Department of Neuroradiology, Hospices Civils de Lyon, Lyon, France.
- Laura Mechtouff
- Department of Vascular Neurology, Hospices Civils de Lyon, Lyon, France; Department of Cardiology, Clinical Investigation Center, CarMeN INSERM U1060, INRA U1397, INSA Lyon, Université Lyon 1, Hospices Civils de Lyon, Lyon, France.
- Norbert Nighoghossian
- CREATIS, CNRS, UMR-5220, INSERM U1206, Université Lyon 1, INSA Lyon, Villeurbanne, France; Department of Vascular Neurology, Hospices Civils de Lyon, Lyon, France.
- Michel Ovize
- Department of Neuroradiology, Hospices Civils de Lyon, Lyon, France.
- Carole Frindel
- CREATIS, CNRS, UMR-5220, INSERM U1206, Université Lyon 1, INSA Lyon, Villeurbanne, France.
|