1
Lai J, Luo Z, Liu J, Hu H, Jiang H, Liu P, He L, Cheng W, Ren W, Wu Y, Piao JG, Wu Z. Charged Gold Nanoparticles for Target Identification-Alignment and Automatic Segmentation of CT Image-Guided Adaptive Radiotherapy in Small Hepatocellular Carcinoma. Nano Lett 2024; 24:10614-10623. [PMID: 39046153; PMCID: PMC11363118; DOI: 10.1021/acs.nanolett.4c02823]
Abstract
Because of the challenges posed by anatomical uncertainties and the low resolution of plain computed tomography (CT) scans, implementing adaptive radiotherapy (ART) for small hepatocellular carcinoma (sHCC) using artificial intelligence (AI) faces obstacles in tumor identification-alignment and automatic segmentation. The current study aims to improve sHCC imaging for ART using a gold nanoparticle (Au NP)-based CT contrast agent to enhance AI-driven automated image processing. The synthesized charged Au NPs demonstrated notable in vitro aggregation, low cytotoxicity, and minimal organ toxicity. An in situ sHCC mouse model was established for in vivo CT imaging at multiple time points. The enhanced CT images processed using 3D U-Net and 3D Trans U-Net AI models demonstrated high geometric and dosimetric accuracy. Therefore, charged Au NPs enable accurate and automatic sHCC segmentation in CT images using classical AI models, potentially addressing the technical challenges related to tumor identification, alignment, and automatic segmentation in CT-guided online ART.
Affiliation(s)
- Jianjun Lai
- Department of Radiation Oncology, Zhejiang Hospital, Hangzhou 310013, China; Institute of Intelligent Control and Robotics, Hangzhou Dianzi University, Hangzhou 310018, China
- Zhizeng Luo
- Institute of Intelligent Control and Robotics, Hangzhou Dianzi University, Hangzhou 310018, China
- Jiping Liu
- Department of Radiation Physics, Zhejiang Cancer Hospital, Hangzhou 310022, China
- Haili Hu
- Department of Radiation Oncology, Zhejiang Hospital, Hangzhou 310013, China
- Hao Jiang
- Department of Radiation Oncology, Zhejiang Hospital, Hangzhou 310013, China
- Pengyuan Liu
- Department of Radiation Oncology, Zhejiang Hospital, Hangzhou 310013, China
- Li He
- School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, China
- Weiyi Cheng
- School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, China
- Weiye Ren
- School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, China
- Yajun Wu
- Department of Pharmacy, Zhejiang Hospital, Hangzhou 310013, China
- Ji-Gang Piao
- School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, China
- Zhibing Wu
- Department of Radiation Oncology, Zhejiang Hospital, Hangzhou 310013, China; Department of Radiation Oncology, Affiliated Zhejiang Hospital, Zhejiang University School of Medicine, Hangzhou 310013, China
2
Wu H, Min W, Gai D, Huang Z, Geng Y, Wang Q, Chen R. HD-Former: A hierarchical dependency Transformer for medical image segmentation. Comput Biol Med 2024; 178:108671. [PMID: 38870721; DOI: 10.1016/j.compbiomed.2024.108671]
Abstract
Medical image segmentation is a compelling fundamental problem and an important auxiliary tool for clinical applications. Recently, the Transformer model has emerged as a valuable tool for addressing the limitations of convolutional neural networks (CNNs) by effectively capturing global relationships, and numerous hybrid architectures combining CNNs and Transformers have been devised to enhance segmentation performance. However, they suffer from multilevel semantic feature gaps and fail to account for multilevel dependencies between space and channel. In this paper, we propose a hierarchical dependency Transformer for medical image segmentation, named HD-Former. First, we utilize a Compressed Bottleneck (CB) module to enrich shallow features and localize the target region. We then introduce the Dual Cross Attention Transformer (DCAT) module to fuse multilevel features and bridge the feature gap. In addition, we design the Broad Exploration Network (BEN), which cascades convolution and self-attention over different receptive fields to capture hierarchical dense contextual semantic features both locally and globally. Finally, we exploit an uncertain multitask edge loss to adaptively map predictions to a consistent feature space, which optimizes segmentation edges. Extensive experiments on medical image segmentation using the ISIC, LiTS, Kvasir-SEG, and CVC-ClinicDB datasets demonstrate that HD-Former surpasses state-of-the-art methods in terms of both subjective visual performance and objective evaluation. Code: https://github.com/barcelonacontrol/HD-Former.
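Cross-attention of the kind the DCAT module builds on can be illustrated with plain NumPy. The block below is a generic single-head attention from one feature level over another, not the authors' implementation; all array shapes are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values, d_k):
    """Attend from one feature level (queries) over another (keys/values)."""
    # queries: (n_q, d_k), keys_values: (n_kv, d_k)
    scores = queries @ keys_values.T / np.sqrt(d_k)   # (n_q, n_kv)
    weights = softmax(scores, axis=-1)                # each row sums to 1
    return weights @ keys_values                      # (n_q, d_k)

rng = np.random.default_rng(0)
shallow = rng.standard_normal((16, 32))   # e.g. tokens from a shallow stage
deep = rng.standard_normal((64, 32))      # tokens from a deeper stage
fused = cross_attention(shallow, deep, 32)
```

In a real dual-cross-attention module the two levels would attend to each other with learned query/key/value projections; this sketch only shows the attention arithmetic itself.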
Affiliation(s)
- Haifan Wu
- School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China.
- Weidong Min
- School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China; Institute of Metaverse, Nanchang University, Nanchang, 330031, China; Jiangxi Key Laboratory of Virtual Reality, Nanchang, 330031, China.
- Di Gai
- School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China; Institute of Metaverse, Nanchang University, Nanchang, 330031, China; Jiangxi Key Laboratory of Virtual Reality, Nanchang, 330031, China.
- Zheng Huang
- School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China.
- Yuhan Geng
- School of Public Health, University of Michigan, Ann Arbor, MI, 48105, USA.
- Qi Wang
- School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China; Institute of Metaverse, Nanchang University, Nanchang, 330031, China; Jiangxi Key Laboratory of Virtual Reality, Nanchang, 330031, China.
- Ruibin Chen
- School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China; Information Department, The First Affiliated Hospital of Nanchang University, Nanchang, 330096, China.
3
Hsiao CH, Lin FYS, Sun TL, Liao YY, Wu CH, Lai YC, Wu HP, Liu PR, Xiao BR, Chen CH, Huang Y. Precision and Robust Models on Healthcare Institution Federated Learning for Predicting HCC on Portal Venous CT Images. IEEE J Biomed Health Inform 2024; 28:4674-4687. [PMID: 38739503; DOI: 10.1109/jbhi.2024.3400599]
Abstract
Hepatocellular carcinoma (HCC), the most common type of liver cancer, poses significant challenges in detection and diagnosis. Medical imaging, especially computed tomography (CT), is pivotal in non-invasively identifying this disease, requiring substantial expertise for interpretation. This research introduces an innovative strategy that integrates two-dimensional (2D) and three-dimensional (3D) deep learning models within a federated learning (FL) framework for precise segmentation of liver and tumor regions in medical images. The study utilized 131 CT scans from the Liver Tumor Segmentation (LiTS) challenge and demonstrated the superior efficiency and accuracy of the proposed Hybrid-ResUNet model with a Dice score of 0.9433 and an AUC of 0.9965 compared to ResNet and EfficientNet models. This FL approach is beneficial for conducting large-scale clinical trials while safeguarding patient privacy across healthcare settings. It facilitates active engagement in problem-solving, data collection, model development, and refinement. The study also addresses data imbalances in the FL context, showing resilience and highlighting local models' robust performance. Future research will concentrate on refining federated learning algorithms and their incorporation into the continuous implementation and deployment (CI/CD) processes in AI system operations, emphasizing the dynamic involvement of clients. We recommend a collaborative human-AI endeavor to enhance feature extraction and knowledge transfer. These improvements are intended to boost equitable and efficient data collaboration across various sectors in practical scenarios, offering a crucial guide for forthcoming research in medical AI.
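Federated learning setups like the one described aggregate client models without sharing raw scans. A minimal FedAvg-style sketch of weighted parameter averaging follows; the aggregation rule and the toy parameters are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """Aggregate per-client parameter arrays, weighted by local dataset size."""
    total = sum(client_sizes)
    agg = [np.zeros_like(w) for w in client_weights[0]]
    for weights, n in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            agg[i] += (n / total) * w
    return agg

# two toy "clients", each holding one weight matrix and one bias vector
c1 = [np.ones((2, 2)), np.zeros(2)]
c2 = [3 * np.ones((2, 2)), np.ones(2)]
global_model = fed_avg([c1, c2], client_sizes=[100, 300])
# weighted mean for the matrix entries: 0.25 * 1 + 0.75 * 3 = 2.5
```

The size weighting is what lets larger local cohorts contribute proportionally more, one common way of handling the data imbalance the abstract mentions.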
4
Jiao C, Lao Y, Zhang W, Braunstein S, Salans M, Villanueva-Meyer JE, Hervey-Jumper SL, Yang B, Morin O, Valdes G, Fan Z, Shiroishi M, Zada G, Sheng K, Yang W. Multi-modal fusion and feature enhancement U-Net coupling with stem cell niches proximity estimation for voxel-wise GBM recurrence prediction. Phys Med Biol 2024; 69. [PMID: 39019073; PMCID: PMC11308744; DOI: 10.1088/1361-6560/ad64b8]
Abstract
Objective. We aim to develop a Multi-modal Fusion and Feature Enhancement U-Net (MFFE U-Net) coupled with stem cell niche proximity estimation to improve voxel-wise glioblastoma (GBM) recurrence prediction. Approach. 57 patients with pre- and post-surgery magnetic resonance (MR) scans were retrospectively solicited from 4 databases. Post-surgery MR scans included those from two months before the clinical diagnosis of recurrence and from the day of the radiologically confirmed recurrence. Recurrences were manually annotated on the T1ce images. The high-risk recurrence region was first determined. Then, a sparse multi-modal feature fusion U-Net was developed. The 50 patients from 3 of the databases were divided into 70% training, 10% validation, and 20% testing; 7 patients from the 4th institution were used as an external test set with transfer learning. Model performance was evaluated by recall, precision, F1-score, and the 95th-percentile Hausdorff Distance (HD95). The proposed MFFE U-Net was compared to a support vector machine (SVM) model and two state-of-the-art neural networks, and an ablation study was performed. Main results. The MFFE U-Net achieved a precision of 0.79 ± 0.08, a recall of 0.85 ± 0.11, and an F1-score of 0.82 ± 0.09. Statistically significant improvement was observed when comparing MFFE U-Net with the proximity-estimation-coupled SVM (SVMPE), mU-Net, and Deeplabv3. The HD95 was 2.75 ± 0.44 mm for the 10 patients used in the model construction and 3.91 ± 0.83 mm for the 7 patients used for external testing. The ablation test showed that all five MR sequences contributed to the performance of the final model, with T1ce contributing the most. Convergence analysis, time efficiency analysis, and visualization of intermediate results further characterized the proposed method. Significance. We present an advanced MFFE learning framework, MFFE U-Net, for effective voxel-wise GBM recurrence prediction. MFFE U-Net performs significantly better than state-of-the-art networks and can potentially guide early radiotherapy intervention for disease recurrence.
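The HD95 metric reported above can be computed directly from two binary masks. The following brute-force NumPy sketch is adequate for small masks (production code would use a distance transform); the toy masks and voxel spacing are illustrative.

```python
import numpy as np

def hd95(mask_a, mask_b, spacing=1.0):
    """95th-percentile symmetric Hausdorff distance between two binary masks."""
    pts_a = np.argwhere(mask_a) * spacing
    pts_b = np.argwhere(mask_b) * spacing
    # pairwise Euclidean distances between the two surface point sets
    d = np.linalg.norm(pts_a[:, None, :] - pts_b[None, :, :], axis=-1)
    d_ab = d.min(axis=1)   # each point of A to its nearest point of B
    d_ba = d.min(axis=0)   # and vice versa
    return np.percentile(np.concatenate([d_ab, d_ba]), 95)

a = np.zeros((20, 20), dtype=bool); a[5:10, 5:10] = True
b = np.zeros((20, 20), dtype=bool); b[5:10, 6:11] = True  # shifted one voxel
distance = hd95(a, b)  # 1.0 voxel for this toy shift
```

Taking the 95th percentile rather than the maximum makes the metric robust to a few stray voxels, which is why HD95 is preferred over the plain Hausdorff distance in segmentation papers.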
Affiliation(s)
- Changzhe Jiao
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143
- Yi Lao
- Department of Radiation Oncology, UC Los Angeles, Los Angeles, CA 90095
- Wenwen Zhang
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143
- Steve Braunstein
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143
- Mia Salans
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143
- Bo Yang
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143
- Olivier Morin
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143
- Gilmer Valdes
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143
- Zhaoyang Fan
- Department of Radiology, University of Southern California, Los Angeles, CA 90033
- Mark Shiroishi
- Department of Radiology, University of Southern California, Los Angeles, CA 90033
- Gabriel Zada
- Department of Neurosurgery, University of Southern California, Los Angeles, CA 90033
- Ke Sheng
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143
- Wensha Yang
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143
5
Kumar K, Yeo AU, McIntosh L, Kron T, Wheeler G, Franich RD. Deep Learning Auto-Segmentation Network for Pediatric Computed Tomography Data Sets: Can We Extrapolate From Adults? Int J Radiat Oncol Biol Phys 2024; 119:1297-1306. [PMID: 38246249; DOI: 10.1016/j.ijrobp.2024.01.201]
Abstract
PURPOSE Artificial intelligence (AI)-based auto-segmentation models hold promise for enhanced efficiency and consistency in organ contouring for adaptive radiation therapy and radiation therapy planning. However, their performance on pediatric computed tomography (CT) data and their cross-scanner compatibility remain unclear. This study aimed to evaluate the performance of AI-based auto-segmentation models trained on adult CT data when applied to pediatric data sets and to explore the improvement gained by including pediatric training data. It also examined their ability to accurately segment CT data acquired from different scanners. METHODS AND MATERIALS Using the nnU-Net framework, segmentation models were trained on data sets of adult, pediatric, and combined CT scans for 7 pelvic/thoracic organs. Each model was trained on 290 to 300 cases per category and organ. Training data sets included a combination of clinical data and several open repositories. The study incorporated a database of 459 pediatric (0-16 years) and 950 adult (>18 years) CT scans, all with human expert ground-truth contours of the selected organs. Performance was evaluated using Dice similarity coefficients (DSC) of the model-generated contours. RESULTS AI models trained exclusively on adult data underperformed on pediatric data, especially for the 0 to 2 age group: mean DSC was below 0.5 for the bladder and spleen. The addition of pediatric training data yielded significant improvement for all age groups, achieving a mean DSC above 0.85 for all organs in every age group. Larger organs such as the liver and kidneys maintained consistent performance across age groups for all models. No significant difference emerged in the cross-scanner evaluation, suggesting robust cross-scanner generalization. CONCLUSIONS For optimal segmentation across age groups, it is important to include pediatric data in the training of segmentation models. The successful cross-scanner generalization also supports the real-world clinical applicability of these AI models. This study emphasizes the significance of data set diversity in training robust AI systems for medical image interpretation tasks.
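The DSC used throughout this evaluation is straightforward to compute from binary masks; the toy masks below are illustrative.

```python
import numpy as np

def dice(pred, truth, eps=1e-8):
    """Dice similarity coefficient between two binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum() + eps)

gt = np.zeros((10, 10), dtype=bool); gt[2:8, 2:8] = True   # 36 voxels
pr = np.zeros((10, 10), dtype=bool); pr[3:8, 2:8] = True   # 30 voxels, all inside gt
score = dice(gt, pr)  # 2 * 30 / (30 + 36) ≈ 0.909
```

A DSC of 1.0 means perfect overlap; the sub-0.5 bladder and spleen scores quoted above mean less than one-third of the combined voxel count was shared between prediction and ground truth.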
Affiliation(s)
- Kartik Kumar
- Physical Sciences Department, Peter MacCallum Cancer Centre, Victoria, Australia; School of Science, RMIT University, Melbourne, Victoria, Australia
- Adam U Yeo
- Physical Sciences Department, Peter MacCallum Cancer Centre, Victoria, Australia; School of Science, RMIT University, Melbourne, Victoria, Australia; Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria, Australia
- Lachlan McIntosh
- Physical Sciences Department, Peter MacCallum Cancer Centre, Victoria, Australia; School of Science, RMIT University, Melbourne, Victoria, Australia
- Tomas Kron
- Physical Sciences Department, Peter MacCallum Cancer Centre, Victoria, Australia; School of Science, RMIT University, Melbourne, Victoria, Australia; Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria, Australia; Centre for Medical Radiation Physics, University of Wollongong, Wollongong, New South Wales, Australia
- Greg Wheeler
- Physical Sciences Department, Peter MacCallum Cancer Centre, Victoria, Australia; Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria, Australia
- Rick D Franich
- Physical Sciences Department, Peter MacCallum Cancer Centre, Victoria, Australia; School of Science, RMIT University, Melbourne, Victoria, Australia.
6
Lin H, Zhao M, Zhu L, Pei X, Wu H, Zhang L, Li Y. Gaussian filter facilitated deep learning-based architecture for accurate and efficient liver tumor segmentation for radiation therapy. Front Oncol 2024; 14:1423774. [PMID: 38966060; PMCID: PMC11222586; DOI: 10.3389/fonc.2024.1423774]
Abstract
Purpose Addressing the challenges of unclear tumor boundaries and confusion between cysts and tumors in liver tumor segmentation, this study aims to develop an auto-segmentation method combining a Gaussian filter with the nnU-Net architecture to effectively distinguish tumors from cysts and enhance the accuracy of liver tumor auto-segmentation. Methods First, 130 cases from the Liver Tumor Segmentation Challenge 2017 (LiTS2017) were used for training and validating the nnU-Net-based auto-segmentation model. Then, 14 cases from the 3D-IRCADb dataset and 25 liver cancer cases retrospectively collected at our hospital were used for testing. The Dice similarity coefficient (DSC) was used to evaluate the accuracy of the auto-segmentation model against manual contours. Results The nnU-Net achieved an average DSC of 0.86 on the validation set (20 LiTS cases) and 0.82 on the public testing set (14 3D-IRCADb cases). On the clinical testing set, the standalone nnU-Net model achieved an average DSC of 0.75, which increased to 0.81 after post-processing with the Gaussian filter (P<0.05), demonstrating its effectiveness in mitigating the influence of liver cysts on liver tumor segmentation. Conclusion These experiments show that the Gaussian filter is beneficial for improving the accuracy of liver tumor segmentation in the clinic.
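Gaussian-filter post-processing of the kind described can be illustrated with a pure-NumPy separable blur applied to a probability map before thresholding. The sigma, threshold, and toy map below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    radius = radius or int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def gaussian_smooth_2d(img, sigma):
    """Separable Gaussian blur: filter rows, then columns."""
    k = gaussian_kernel(sigma)
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, tmp)

prob = np.zeros((32, 32))
prob[10:20, 10:20] = 1.0   # coherent tumor response from the network
prob[25, 25] = 1.0         # isolated speckle, e.g. a spurious cyst response
mask = gaussian_smooth_2d(prob, sigma=1.5) > 0.5
```

Smoothing before thresholding suppresses the isolated high-intensity pixel while leaving the coherent tumor region intact, which is one plausible mechanism for the cyst-suppression effect reported above.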
Affiliation(s)
- Hongyu Lin
- Department of Oncology, First Hospital of Hebei Medical University, Shijiazhuang, China
- Min Zhao
- Department of Oncology, First Hospital of Hebei Medical University, Shijiazhuang, China
- Lingling Zhu
- Department of Oncology, First Hospital of Hebei Medical University, Shijiazhuang, China
- Xi Pei
- Technology Development Department, Anhui Wisdom Technology Co., Ltd., Hefei, China
- Haotian Wu
- Technology Development Department, Anhui Wisdom Technology Co., Ltd., Hefei, China
- Lian Zhang
- Department of Oncology, First Hospital of Hebei Medical University, Shijiazhuang, China
- Ying Li
- Department of Oncology, First Hospital of Hebei Medical University, Shijiazhuang, China
7
Luo J, Dai P, He Z, Huang Z, Liao S, Liu K. Deep learning models for ischemic stroke lesion segmentation in medical images: A survey. Comput Biol Med 2024; 175:108509. [PMID: 38677171; DOI: 10.1016/j.compbiomed.2024.108509]
Abstract
This paper provides a comprehensive review of deep learning models for ischemic stroke lesion segmentation in medical images. Ischemic stroke is a severe neurological disease and a leading cause of death and disability worldwide. Accurate segmentation of stroke lesions in medical images such as MRI and CT scans is crucial for diagnosis, treatment planning and prognosis. This paper first introduces common imaging modalities used for stroke diagnosis, discussing their capabilities in imaging lesions at different disease stages from the acute to chronic stage. It then reviews three major public benchmark datasets for evaluating stroke segmentation algorithms: ATLAS, ISLES and AISD, highlighting their key characteristics. The paper proceeds to provide an overview of foundational deep learning architectures for medical image segmentation, including CNN-based and transformer-based models. It summarizes recent innovations in adapting these architectures to the task of stroke lesion segmentation across the three datasets, analyzing their motivations, modifications and results. A survey of loss functions and data augmentations employed for this task is also included. The paper discusses various aspects related to stroke segmentation tasks, including prior knowledge, small lesions, and multimodal fusion, and then concludes by outlining promising future research directions. Overall, this comprehensive review covers critical technical developments in the field to support continued progress in automated stroke lesion segmentation.
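Among the loss functions such surveys cover, the soft Dice loss is a common choice for lesion segmentation because it tolerates the extreme class imbalance of small lesions. A minimal NumPy version follows; the smoothing constant and toy targets are illustrative assumptions.

```python
import numpy as np

def soft_dice_loss(prob, target, eps=1.0):
    """Differentiable Dice loss on predicted probabilities vs. a binary target."""
    inter = (prob * target).sum()
    return 1.0 - (2.0 * inter + eps) / (prob.sum() + target.sum() + eps)

target = np.zeros((8, 8)); target[2:6, 2:6] = 1.0   # a 16-pixel lesion
perfect = target.copy()                              # ideal prediction
uniform = np.full((8, 8), 0.5)                       # uninformative prediction
# a perfect prediction gives zero loss; an uninformative one is penalized
```

Because the loss is a ratio of overlap to total mass, the abundant background pixels contribute nothing to the numerator, unlike plain pixel-wise cross-entropy.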
Affiliation(s)
- Jialin Luo
- School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
- Peishan Dai
- School of Computer Science and Engineering, Central South University, Changsha, Hunan, China.
- Zhuang He
- School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
- Zhongchao Huang
- Department of Biomedical Engineering, School of Basic Medical Science, Central South University, Changsha, Hunan, China
- Shenghui Liao
- School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
- Kun Liu
- Brain Hospital of Hunan Province (The Second People's Hospital of Hunan Province), Changsha, Hunan, China
8
Zhou Z, Islam MT, Xing L. Multibranch CNN With MLP-Mixer-Based Feature Exploration for High-Performance Disease Diagnosis. IEEE Trans Neural Netw Learn Syst 2024; 35:7351-7362. [PMID: 37028335; DOI: 10.1109/tnnls.2023.3250490]
Abstract
Deep learning-based diagnosis is becoming an indispensable part of modern healthcare. For high-performance diagnosis, the optimal design of deep neural networks (DNNs) is a prerequisite. Despite their success in image analysis, existing supervised DNNs based on convolutional layers often suffer from rudimentary feature exploration caused by the limited receptive field and biased feature extraction of conventional convolutional neural networks (CNNs), which compromises network performance. Here, we propose a novel feature exploration network named manifold embedded multilayer perceptron (MLP) mixer (ME-Mixer), which utilizes both supervised and unsupervised features for disease diagnosis. In the proposed approach, a manifold embedding network is employed to extract class-discriminative features; then, two MLP-Mixer-based feature projectors are adopted to encode the extracted features with a global receptive field. The ME-Mixer network is quite general and can be added as a plugin to any existing CNN. Comprehensive evaluations on two medical datasets demonstrate that our approach greatly enhances classification accuracy in comparison with different configurations of DNNs, with acceptable computational complexity.
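The MLP-Mixer building block behind such feature projectors alternates a token-mixing MLP (which acts across all tokens at once, giving a global receptive field in one layer) with a channel-mixing MLP. A simplified NumPy sketch follows; layer normalization is omitted and all shapes and weights are illustrative, not the ME-Mixer's actual configuration.

```python
import numpy as np

def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def mixer_block(tokens, w_tok, w_chan):
    """One simplified Mixer block: token-mixing MLP, then channel-mixing MLP."""
    # tokens: (n_tokens, n_channels)
    # token mixing: the MLP acts along the token axis (global receptive field)
    mixed = tokens + w_tok[1] @ gelu(w_tok[0] @ tokens)
    # channel mixing: an ordinary per-token MLP along the channel axis
    return mixed + gelu(mixed @ w_chan[0]) @ w_chan[1]

rng = np.random.default_rng(1)
n_tok, n_ch, hidden = 16, 8, 32
w_tok = (rng.standard_normal((hidden, n_tok)) * 0.1,
         rng.standard_normal((n_tok, hidden)) * 0.1)
w_chan = (rng.standard_normal((n_ch, hidden)) * 0.1,
          rng.standard_normal((hidden, n_ch)) * 0.1)
out = mixer_block(rng.standard_normal((n_tok, n_ch)), w_tok, w_chan)
```

The residual connections and the two orthogonal mixing directions are the essential structure; everything else (normalization, stacking depth) is added in practice.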
9
Liu H, Zhou Y, Gou S, Luo Z. Tumor conspicuity enhancement-based segmentation model for liver tumor segmentation and RECIST diameter measurement in non-contrast CT images. Comput Biol Med 2024; 174:108420. [PMID: 38613896; DOI: 10.1016/j.compbiomed.2024.108420]
Abstract
BACKGROUND AND OBJECTIVE Liver tumor segmentation (LiTS) accuracy on contrast-enhanced computed tomography (CECT) images is higher than that on non-contrast computed tomography (NCCT) images. However, CECT requires contrast medium and repeated scans to obtain multiphase enhanced CT images, which is time-consuming and costly. Despite its lower accuracy, LiTS on NCCT images therefore still plays an irreplaceable role in some clinical settings, such as guided brachytherapy, ablation, or the evaluation of patients with impaired renal function. In this study, we generate enhanced high-contrast pseudo-color CT (PCCT) images to improve the accuracy of LiTS and RECIST diameter measurement on NCCT images. METHODS To generate high-contrast CT liver tumor region images, an intensity-based tumor conspicuity enhancement (ITCE) model was first developed. In the ITCE model, a pseudo-color conversion function was established from the intensity distribution of the tumor and applied to NCCT to generate enhanced PCCT images. Additionally, we designed a tumor conspicuity enhancement-based liver tumor segmentation (TCELiTS) model to improve the segmentation of liver tumors on NCCT images. The TCELiTS model consists of three components: an image enhancement module based on the ITCE model, a segmentation module based on a deep convolutional neural network, and an attention loss module based on restricted activation. Segmentation performance was analyzed using the Dice similarity coefficient (DSC), sensitivity, specificity, and RECIST diameter error. RESULTS To develop the deep learning model, 100 patients with histopathologically confirmed liver tumors (hepatocellular carcinoma, 64 patients; hepatic hemangioma, 36 patients) were randomly divided into a training set (75 patients) and an independent test set (25 patients). Compared with existing automatic tumor segmentation networks trained on CECT images (U-Net, nnU-Net, DeepLab-V3, Modified U-Net), the DSCs achieved on the enhanced PCCT images all improved relative to NCCT images: from 0.696 to 0.713 (U-Net), 0.715 to 0.776 (nnU-Net), 0.748 to 0.788 (DeepLab-V3), and 0.733 to 0.799 (Modified U-Net). In addition, an observer study including 5 doctors compared segmentation performance on enhanced PCCT images with that on NCCT images and showed that enhanced PCCT images are more advantageous for doctors segmenting tumor regions: accuracy improved by approximately 3%-6%, while the time required to segment a single CT image was reduced by approximately 50%. CONCLUSIONS Experimental results show that the ITCE model can generate high-contrast enhanced PCCT images, especially in liver regions, and that the TCELiTS model can improve LiTS accuracy on NCCT images.
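The RECIST long-axis diameter of a segmented lesion can be estimated from a binary mask as the maximum pairwise distance between lesion pixels. The brute-force sketch below is adequate for small lesions; the paper's exact measurement procedure may differ, and the toy lesion and pixel spacing are illustrative.

```python
import numpy as np

def recist_diameter(mask, pixel_mm=1.0):
    """Longest in-plane diameter of a lesion mask (RECIST long axis), in mm."""
    pts = np.argwhere(mask).astype(float)
    # pairwise distances between all lesion pixels; fine for small lesions
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    return d.max() * pixel_mm

lesion = np.zeros((40, 40), dtype=bool)
lesion[10:20, 10:30] = True   # a 10 x 20 pixel rectangle
# long axis is the rectangle's diagonal: sqrt(9**2 + 19**2) pixels
```

For large lesions one would restrict the pairwise search to boundary pixels (e.g. via a convex hull) to keep the distance matrix small.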
Affiliation(s)
- Haofeng Liu
- School of Artificial Intelligence, Xidian University, Xi'an, 710071, China
- Yanyan Zhou
- Department of Interventional Radiology, Tangdu Hospital, Airforce Medical University, Xi'an, 710038, China
- Shuiping Gou
- School of Artificial Intelligence, Xidian University, Xi'an, 710071, China
- Zhonghua Luo
- Department of Interventional Radiology, Tangdu Hospital, Airforce Medical University, Xi'an, 710038, China.
10
Liu H, Yang J, Jiang C, He S, Fu Y, Zhang S, Hu X, Fang J, Ji W. S2DA-Net: Spatial and spectral-learning double-branch aggregation network for liver tumor segmentation in CT images. Comput Biol Med 2024; 174:108400. [PMID: 38613888; DOI: 10.1016/j.compbiomed.2024.108400]
Abstract
Accurate liver tumor segmentation is crucial for aiding radiologists in hepatocellular carcinoma evaluation and surgical planning. While convolutional neural networks (CNNs) have been successful in medical image segmentation, they face challenges in capturing long-term dependencies among pixels. On the other hand, Transformer-based models demand a high number of parameters and involve significant computational costs. To address these issues, we propose the Spatial and Spectral-learning Double-branched Aggregation Network (S2DA-Net) for liver tumor segmentation. S2DA-Net consists of a double-branched encoder and a decoder with a Group Multi-Head Cross-Attention Aggregation (GMCA) module. The two encoder branches are a Fourier Spectral-learning Multi-scale Fusion (FSMF) branch and a Multi-axis Aggregation Hadamard Attention (MAHA) branch. The FSMF branch employs a Fourier-based network to learn amplitude and phase information, capturing richer features and detailed information without introducing an excessive number of parameters. The MAHA branch incorporates spatial information, enhancing discriminative features while minimizing computational costs. In the decoding path, the GMCA module extracts local information and establishes long-term dependencies, improving localization by amalgamating features from the two branches. Experimental results on the public LiTS2017 liver tumor dataset show that the proposed segmentation model achieves significant improvements over state-of-the-art methods, obtaining a dice per case (DPC) of 69.4% and a global dice (DG) of 80.0% for liver tumor segmentation. Meanwhile, the model pre-trained on LiTS2017 obtains a DPC of 73.4% and a DG of 82.2% on the 3DIRCADb dataset.
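The amplitude/phase decomposition that the FSMF branch learns from can be reproduced with NumPy's FFT. The low-frequency emphasis at the end is purely illustrative (the branch learns its spectral weighting; this fixed damping is an assumption for demonstration).

```python
import numpy as np

img = np.random.default_rng(2).standard_normal((32, 32))
spec = np.fft.fft2(img)
amplitude, phase = np.abs(spec), np.angle(spec)

# amplitude and phase carry complementary information; recombining them
# recovers the image exactly
recon = np.fft.ifft2(amplitude * np.exp(1j * phase)).real

# a crude fixed low-frequency emphasis: damp high-frequency amplitudes
freq_r = np.hypot(*np.meshgrid(np.fft.fftfreq(32), np.fft.fftfreq(32)))
smoothed = np.fft.ifft2(amplitude * np.exp(-5 * freq_r)
                        * np.exp(1j * phase)).real
```

Operating on the full spectrum gives every output pixel a dependence on every input pixel at fixed parameter cost, which is the appeal of Fourier branches over stacking large spatial kernels.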
Affiliation(s)
- Huaxiang Liu
- Department of Radiology, Taizhou Hospital, Zhejiang University, Taizhou, 318000, Zhejiang, China; Institute of Intelligent Information Processing, Taizhou University, Taizhou, 318000, Zhejiang, China; Key Laboratory of Evidence-based Radiology of Taizhou, Taizhou, 317000, Zhejiang, China
- Jie Yang
- School of Geophysics and Measurement and Control Technology, East China University of Technology, Nanchang, 330013, China
- Chao Jiang
- School of Geophysics and Measurement and Control Technology, East China University of Technology, Nanchang, 330013, China
- Sailing He
- Department of Radiology, Taizhou Hospital, Zhejiang University, Taizhou, 318000, Zhejiang, China
- Youyao Fu
- Institute of Intelligent Information Processing, Taizhou University, Taizhou, 318000, Zhejiang, China
- Shiqing Zhang
- Institute of Intelligent Information Processing, Taizhou University, Taizhou, 318000, Zhejiang, China
- Xudong Hu
- Key Laboratory of Evidence-based Radiology of Taizhou, Taizhou, 317000, Zhejiang, China
- Jiangxiong Fang
- Institute of Intelligent Information Processing, Taizhou University, Taizhou, 318000, Zhejiang, China.
- Wenbin Ji
- Department of Radiology, Taizhou Hospital, Zhejiang University, Taizhou, 318000, Zhejiang, China; Key Laboratory of Evidence-based Radiology of Taizhou, Taizhou, 317000, Zhejiang, China.
Collapse
|
11
|
Wang KN, Li SX, Bu Z, Zhao FX, Zhou GQ, Zhou SJ, Chen Y. SBCNet: Scale and Boundary Context Attention Dual-Branch Network for Liver Tumor Segmentation. IEEE J Biomed Health Inform 2024; 28:2854-2865. [PMID: 38427554 DOI: 10.1109/jbhi.2024.3370864] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/03/2024]
Abstract
Automated segmentation of liver tumors in CT scans is pivotal for diagnosing and treating liver cancer, offering a valuable alternative to labor-intensive manual delineation and ensuring accurate, reliable clinical assessment. However, the inherent variability of liver tumors, coupled with blurred boundaries in their imaging characteristics, presents a substantial obstacle to precise segmentation. In this paper, we propose a novel dual-branch liver tumor segmentation model, SBCNet, to address these challenges effectively. Specifically, our method introduces a contextual encoding module that better identifies tumor variability using an advanced multi-scale adaptive kernel. Moreover, a boundary enhancement module is designed for the counterpart branch to sharpen boundary perception by incorporating contour learning with the Sobel operator. Finally, we propose a hybrid multi-task loss function that jointly addresses tumors' scale and boundary features, fostering interaction between the tasks of the dual branches and further improving tumor segmentation. Experimental validation on the publicly available LiTS dataset demonstrates the practical efficacy of each module, with SBCNet yielding competitive results compared to other state-of-the-art methods for liver tumor segmentation.
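The Sobel-based contour learning can be sketched as follows: a boundary target is derived from a segmentation mask by thresholding its Sobel gradient magnitude (a minimal NumPy version for illustration; in the paper the operator is applied inside the network):

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

def conv2_same(img, kernel):
    """Naive 'same'-padded 2D correlation with a 3x3 kernel."""
    padded = np.pad(img, 1)
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * kernel)
    return out

def boundary_map(mask):
    """Binary boundary target: nonzero Sobel gradient magnitude of the mask."""
    gx = conv2_same(mask.astype(float), SOBEL_X)
    gy = conv2_same(mask.astype(float), SOBEL_X.T)
    return (np.hypot(gx, gy) > 0).astype(np.uint8)

mask = np.zeros((8, 8), dtype=np.uint8)
mask[2:6, 2:6] = 1           # a square "tumor"
edge = boundary_map(mask)    # a ring around the square, zero inside and outside
```

Supervising one branch with `edge` instead of `mask` is what pushes the network to attend to contours rather than region interiors.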
Collapse
|
12
|
Zhang W, Tao Y, Huang Z, Li Y, Chen Y, Song T, Ma X, Zhang Y. Multi-phase features interaction transformer network for liver tumor segmentation and microvascular invasion assessment in contrast-enhanced CT. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2024; 21:5735-5761. [PMID: 38872556 DOI: 10.3934/mbe.2024253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2024]
Abstract
Precise segmentation of liver tumors from computed tomography (CT) scans is a prerequisite step in various clinical applications. Multi-phase CT imaging enhances tumor characterization, thereby assisting radiologists in accurate identification. However, existing automatic liver tumor segmentation models did not fully exploit multi-phase information and lacked the capability to capture global information. In this study, we developed a pioneering multi-phase feature interaction Transformer network (MI-TransSeg) for accurate liver tumor segmentation and a subsequent microvascular invasion (MVI) assessment in contrast-enhanced CT images. In the proposed network, an efficient multi-phase features interaction module was introduced to enable bi-directional feature interaction among multiple phases, thus maximally exploiting the available multi-phase information. To enhance the model's capability to extract global information, a hierarchical transformer-based encoder and decoder architecture was designed. Importantly, we devised a multi-resolution scales feature aggregation strategy (MSFA) to optimize the parameters and performance of the proposed model. Subsequent to segmentation, the liver tumor masks generated by MI-TransSeg were applied to extract radiomic features for the clinical applications of the MVI assessment. With Institutional Review Board (IRB) approval, a clinical multi-phase contrast-enhanced CT abdominal dataset was collected that included 164 patients with liver tumors. The experimental results demonstrated that the proposed MI-TransSeg was superior to various state-of-the-art methods. Additionally, we found that the tumor mask predicted by our method showed promising potential in the assessment of microvascular invasion. In conclusion, MI-TransSeg presents an innovative paradigm for the segmentation of complex liver tumors, thus underscoring the significance of multi-phase CT data exploitation. The proposed MI-TransSeg network has the potential to assist radiologists in diagnosing liver tumors and assessing microvascular invasion.
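The bi-directional interaction idea, each phase refining the other, can be caricatured in a few lines (a hypothetical gating form chosen for illustration; the paper's module uses learned attention, not this fixed sigmoid gate):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bidirectional_interact(feat_a, feat_b):
    """Each phase receives a gated contribution from the other phase,
    so information flows in both directions between phases."""
    a_refined = feat_a + sigmoid(feat_b) * feat_b   # phase B informs phase A
    b_refined = feat_b + sigmoid(feat_a) * feat_a   # phase A informs phase B
    return a_refined, b_refined

fa = np.zeros((2, 4, 4))   # toy feature maps for two phases
fb = np.ones((2, 4, 4))
ra, rb = bidirectional_interact(fa, fb)
```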
Collapse
Affiliation(s)
- Wencong Zhang
- Department of Biomedical Engineering, College of Engineering, Shantou University, Shantou, China
- Department of Biomedical Engineering, College of Design and Engineering, National University of Singapore, Singapore
| | - Yuxi Tao
- Department of Radiology, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
| | - Zhanyao Huang
- Department of Biomedical Engineering, College of Engineering, Shantou University, Shantou, China
| | - Yue Li
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
| | - Yingjia Chen
- Department of Biomedical Engineering, College of Engineering, Shantou University, Shantou, China
| | - Tengfei Song
- Department of Radiology, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
| | - Xiangyuan Ma
- Department of Biomedical Engineering, College of Engineering, Shantou University, Shantou, China
| | - Yaqin Zhang
- Department of Biomedical Engineering, College of Engineering, Shantou University, Shantou, China
| |
Collapse
|
13
|
Zhang K, Yang X, Cui Y, Zhao J, Li D. Imaging segmentation mechanism for rectal tumors using improved U-Net. BMC Med Imaging 2024; 24:95. [PMID: 38654162 DOI: 10.1186/s12880-024-01269-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/19/2024] [Accepted: 04/05/2024] [Indexed: 04/25/2024] Open
Abstract
OBJECTIVE In radiation therapy, segmentation of cancerous regions in magnetic resonance images (MRI) is a critical step. For rectal cancer, automatic segmentation of rectal tumors from MRI is a great challenge. Two main shortcomings of existing deep learning-based methods lead to incorrect segmentation: 1) many organs surround the rectum, and some have shapes similar to those of rectal tumors; 2) high-level features extracted by conventional neural networks often lack sufficient high-resolution information. Therefore, an improved U-Net segmentation network based on attention mechanisms is proposed to replace the traditional U-Net network. METHODS The overall framework of the proposed method is based on the traditional U-Net. A ResNeSt module was added to extract the overall features, and a shape module was added after the encoder layer. We then combined the outputs of the shape module and the decoder to obtain the results. Moreover, the model used different types of attention mechanisms so that the network learned information to improve segmentation accuracy. RESULTS We validated the effectiveness of the proposed method using 3773 2D MRI datasets from 304 patients. The proposed method achieved 0.987, 0.946, 0.897, and 0.899 for Dice, MPA, MIoU, and FWIoU, respectively; these values are significantly better than those of other existing methods. CONCLUSION By saving time, the proposed method can help radiologists segment rectal tumors effectively and enables them to focus on patients whose cancerous regions are difficult for the network to segment. SIGNIFICANCE The proposed method can help doctors segment rectal tumors, thereby ensuring good diagnostic quality and accuracy.
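The Dice and IoU-family figures reported above are overlap ratios between predicted and reference masks; the standard definitions are a few lines of NumPy:

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice = 2|P intersect T| / (|P| + |T|)."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def iou_score(pred, target, eps=1e-7):
    """IoU = |P intersect T| / |P union T|."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (inter + eps) / (union + eps)

p = np.array([1, 1, 0, 0])
t = np.array([1, 0, 1, 0])
```

Note that Dice is always at least as large as IoU for the same pair of masks, which is why Dice values in the literature look more flattering than IoU on the same predictions.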
Collapse
Affiliation(s)
- Kenan Zhang
- College of Electronic Information and Optical Engineering, Taiyuan University of Technology, Taiyuan, 030024, China
- Key Laboratory of Big Data Fusion Analysis and Application of Shanxi Province, Taiyuan, 030024, China
| | - Xiaotang Yang
- Department of Radiology, Shanxi Cancer Hospital, Shanxi Medical University, Taiyuan, 030013, China.
| | - Yanfen Cui
- Department of Radiology, Shanxi Cancer Hospital, Shanxi Medical University, Taiyuan, 030013, China
| | - Jumin Zhao
- College of Electronic Information and Optical Engineering, Taiyuan University of Technology, Taiyuan, 030024, China
- Key Laboratory of Big Data Fusion Analysis and Application of Shanxi Province, Taiyuan, 030024, China
- Intelligent Perception Engineering Technology Center of Shanxi, Taiyuan, 030024, China
| | - Dengao Li
- College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, 030024, China.
- Key Laboratory of Big Data Fusion Analysis and Application of Shanxi Province, Taiyuan, 030024, China.
- Intelligent Perception Engineering Technology Center of Shanxi, Taiyuan, 030024, China.
| |
Collapse
|
14
|
Yang X, Zheng Y, Mei C, Jiang G, Tian B, Wang L. UGLS: an uncertainty guided deep learning strategy for accurate image segmentation. Front Physiol 2024; 15:1362386. [PMID: 38651048 PMCID: PMC11033460 DOI: 10.3389/fphys.2024.1362386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2024] [Accepted: 03/26/2024] [Indexed: 04/25/2024] Open
Abstract
Accurate image segmentation plays a crucial role in computer vision and medical image analysis. In this study, we developed a novel uncertainty guided deep learning strategy (UGLS) to enhance the performance of an existing neural network (i.e., U-Net) in segmenting multiple objects of interest from images of varying modalities. In the developed UGLS, a boundary uncertainty map is introduced for each object based on its coarse segmentation (obtained by the U-Net) and then combined with the input images for fine segmentation of the objects. We validated the developed method by segmenting optic cup (OC) regions from color fundus images and left and right lung regions from X-ray images. Experiments on public fundus and X-ray image datasets showed that the developed method achieved an average Dice Score (DS) of 0.8791 and a sensitivity (SEN) of 0.8858 for the OC segmentation, and 0.9605, 0.9607, 0.9621, and 0.9668 for the left and right lung segmentation, respectively. Our method significantly improved the segmentation performance of the U-Net, making it comparable or superior to five sophisticated networks (i.e., AU-Net, BiO-Net, AS-Net, Swin-Unet, and TransUNet).
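One simple way to realize a boundary uncertainty map from a coarse probability map, highest where the network is least decided, is the following (our illustrative choice; the paper's exact definition may differ):

```python
import numpy as np

def boundary_uncertainty(prob):
    """1 where p = 0.5 (maximally ambiguous), 0 where p = 0 or 1 (confident).
    On a coarse segmentation, the ambiguous band traces object boundaries."""
    return 1.0 - np.abs(2.0 * prob - 1.0)

coarse = np.array([0.0, 0.25, 0.5, 0.9, 1.0])  # toy coarse probabilities
u = boundary_uncertainty(coarse)               # peaks at the 0.5 entry
```

The map `u` would then be stacked with the input image channels before the fine-segmentation pass, steering the second network's capacity toward the ambiguous boundary band.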
Collapse
Affiliation(s)
- Xiaoguo Yang
- Wenzhou People’s Hospital, The Third Affiliated Hospital of Shanghai University, Wenzhou, China
| | - Yanyan Zheng
- Wenzhou People’s Hospital, The Third Affiliated Hospital of Shanghai University, Wenzhou, China
| | - Chenyang Mei
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, China
| | - Gaoqiang Jiang
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, China
| | - Bihan Tian
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, China
| | - Lei Wang
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, China
| |
Collapse
|
15
|
Zhan F, Wang W, Chen Q, Guo Y, He L, Wang L. Three-Direction Fusion for Accurate Volumetric Liver and Tumor Segmentation. IEEE J Biomed Health Inform 2024; 28:2175-2186. [PMID: 38109246 DOI: 10.1109/jbhi.2023.3344392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2023]
Abstract
Biomedical image segmentation of organs, tissues and lesions has gained increasing attention in clinical treatment planning and navigation, and involves exploring two-dimensional (2D) and three-dimensional (3D) contexts in biomedical images. Compared to 2D methods, 3D methods pay more attention to inter-slice correlations, which offer additional spatial information for image segmentation. An organ or tumor has a 3D structure that can be observed from three directions. Previous studies focus only on the vertical axis, limiting the understanding of the relationship between a tumor and its surrounding tissues; important information can also be obtained from the sagittal and coronal axes. Spatial information of organs and tumors can therefore be gathered from all three directions, i.e., the sagittal, coronal and vertical axes, to better understand the invasion depth of a tumor and its relationship with the surrounding tissues. Moreover, the edges of organs and tumors in biomedical images may be blurred. To address these problems, we propose a three-direction fusion volumetric segmentation (TFVS) model for segmenting 3D biomedical images from the sagittal, coronal and transverse planes. We train our model on the liver task dataset provided by the Medical Segmentation Decathlon challenge. The TFVS method demonstrates competitive performance on the 3D-IRCADB dataset. In addition, the t-test and Wilcoxon signed-rank test are performed to show the statistical significance of the improvement of the proposed method over the baseline methods. The proposed method is expected to be beneficial in guiding and facilitating clinical diagnosis and treatment.
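Slicing one 3D volume along the three axes and averaging the reassembled per-direction predictions can be sketched as follows (a minimal NumPy version; `model_2d` stands in for any 2D segmentation network):

```python
import numpy as np

def predict_along_axis(volume, model_2d, axis):
    """Run a 2D model slice-by-slice along one axis, then reassemble
    the per-slice predictions into a volume with the original layout."""
    slices = np.moveaxis(volume, axis, 0)
    pred = np.stack([model_2d(s) for s in slices])
    return np.moveaxis(pred, 0, axis)

def three_direction_fusion(volume, model_2d):
    """Average the probability volumes predicted along the three axes."""
    return np.mean([predict_along_axis(volume, model_2d, ax)
                    for ax in range(3)], axis=0)

vol = np.random.default_rng(1).random((8, 9, 10))
fused = three_direction_fusion(vol, lambda s: s)  # identity "model" for the demo
```

With a real 2D network, each direction sees a different in-plane context, and the average suppresses direction-specific errors.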
Collapse
|
16
|
Lin Y, Wang J, Liu Q, Zhang K, Liu M, Wang Y. CFANet: Context fusing attentional network for preoperative CT image segmentation in robotic surgery. Comput Biol Med 2024; 171:108115. [PMID: 38402837 DOI: 10.1016/j.compbiomed.2024.108115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2023] [Revised: 01/30/2024] [Accepted: 02/04/2024] [Indexed: 02/27/2024]
Abstract
Accurate segmentation of CT images is crucial for clinical diagnosis and preoperative evaluation of robotic surgery, but fuzzy boundaries and small-sized targets pose challenges. In response, a novel 2D segmentation network named Context Fusing Attentional Network (CFANet) is proposed. CFANet incorporates three key modules to address these challenges: a pyramid fusing module (PFM), a parallel dilated convolution module (PDCM), and a scale attention module (SAM). Integrating these modules into the encoder-decoder structure enables effective utilization of multi-level and multi-scale features. Compared with an advanced segmentation method, the Dice score improved by 2.14% on the liver tumor dataset. This improvement is expected to have a positive impact on the preoperative evaluation of robotic surgery and to support clinical diagnosis, especially in early tumor detection.
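The parallel-dilated-convolution idea behind a module like the PDCM, the same kernel applied at several dilation rates and the branches summed, can be shown in 1D (a toy NumPy version; the real module uses learned 2D kernels):

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """'Same'-padded 1D correlation with `dilation - 1` holes between taps,
    enlarging the receptive field without adding parameters."""
    k = len(kernel)
    span = (k - 1) * dilation
    left = span // 2
    xp = np.pad(x, (left, span - left))
    return np.array([sum(kernel[j] * xp[i + j * dilation] for j in range(k))
                     for i in range(len(x))], dtype=float)

def parallel_dilated(x, kernel, rates=(1, 2, 4)):
    """Sum branches with different dilation rates -> multi-scale context."""
    return sum(dilated_conv1d(x, kernel, r) for r in rates)

x = np.ones(9)
y = parallel_dilated(x, [1.0, 1.0, 1.0])
```

Each branch covers a different neighborhood size, so the sum mixes fine and coarse context at every position.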
Collapse
Affiliation(s)
- Yao Lin
- College of Electrical and Information Engineering, Hunan University, Changsha, 410082, China; National Engineering Research Center of Robot Visual Perception and Control Technology, Hunan University, Changsha, 410082, China
| | - Jiazheng Wang
- College of Electrical and Information Engineering, Hunan University, Changsha, 410082, China; National Engineering Research Center of Robot Visual Perception and Control Technology, Hunan University, Changsha, 410082, China.
| | - Qinghao Liu
- College of Electrical and Information Engineering, Hunan University, Changsha, 410082, China; National Engineering Research Center of Robot Visual Perception and Control Technology, Hunan University, Changsha, 410082, China
| | - Kang Zhang
- College of Electrical and Information Engineering, Hunan University, Changsha, 410082, China; National Engineering Research Center of Robot Visual Perception and Control Technology, Hunan University, Changsha, 410082, China
| | - Min Liu
- College of Electrical and Information Engineering, Hunan University, Changsha, 410082, China; National Engineering Research Center of Robot Visual Perception and Control Technology, Hunan University, Changsha, 410082, China; Research Institute of Hunan University in Chongqing, Chongqing, 401135, China.
| | - Yaonan Wang
- College of Electrical and Information Engineering, Hunan University, Changsha, 410082, China; National Engineering Research Center of Robot Visual Perception and Control Technology, Hunan University, Changsha, 410082, China
| |
Collapse
|
17
|
Ling Y, Wang Y, Liu Q, Yu J, Xu L, Zhang X, Liang P, Kong D. EPolar-UNet: An edge-attending polar UNet for automatic medical image segmentation with small datasets. Med Phys 2024; 51:1702-1713. [PMID: 38299370 DOI: 10.1002/mp.16957] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2023] [Revised: 12/29/2023] [Accepted: 01/14/2024] [Indexed: 02/02/2024] Open
Abstract
BACKGROUND Medical image segmentation is one of the key steps in computer-aided clinical diagnosis, geometric characterization, measurement, image registration, and so forth. Convolutional neural networks, especially UNet and its variants, have been successfully used in many medical image segmentation tasks. However, results are limited by a deficiency in extracting high-resolution edge information, owing to the design of the skip connections in UNet, and by the need for large available datasets. PURPOSE In this paper, we proposed an edge-attending polar UNet (EPolar-UNet), which is trained in the polar coordinate system instead of the classic Cartesian coordinate system, with an edge-attending construction in the skip-connection path. METHODS EPolar-UNet extracts the location information from an eight-stacked hourglass network as the pole for the polar transformation and extracts boundary cues from an edge-attending UNet, which consists of a deconvolution layer and a subtraction operation. RESULTS We evaluated the performance of EPolar-UNet across three imaging modalities for different segmentation tasks: the CVC-ClinicDB dataset for polyps, the ISIC-2018 dataset for skin lesions, and our private ultrasound dataset for liver tumor segmentation. Our proposed model outperformed state-of-the-art models on all three datasets and needed only 30%-60% of the training data of the benchmark UNet model to achieve similar performance. CONCLUSIONS We proposed an end-to-end EPolar-UNet for automatic medical image segmentation and showed good performance on small datasets, which is critical in the field of medical image segmentation.
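The polar resampling at the heart of this approach maps an image onto an (r, theta) grid centred at the detected pole, so a roughly star-convex lesion boundary becomes a near-horizontal curve. A nearest-neighbour sketch:

```python
import numpy as np

def to_polar(img, pole, n_r=32, n_theta=64):
    """Nearest-neighbour resampling of `img` onto a polar (r, theta) grid
    centred at `pole` = (row, col). Out-of-bounds samples stay zero."""
    cy, cx = pole
    r_max = float(np.hypot(*img.shape))
    rs = np.linspace(0.0, r_max, n_r)
    thetas = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    out = np.zeros((n_r, n_theta), dtype=img.dtype)
    for i, r in enumerate(rs):
        for j, t in enumerate(thetas):
            y = int(round(cy + r * np.sin(t)))
            x = int(round(cx + r * np.cos(t)))
            if 0 <= y < img.shape[0] and 0 <= x < img.shape[1]:
                out[i, j] = img[y, x]
    return out

img = np.arange(64, dtype=float).reshape(8, 8)
pol = to_polar(img, pole=(4, 4))   # row 0 repeats the pole pixel for all angles
```

A production version would use bilinear interpolation, but the geometry, and why an off-target pole degrades the transform, is the same.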
Collapse
Affiliation(s)
- Yating Ling
- School of Mathematical Sciences, Zhejiang University, Hangzhou, China
| | - Yuling Wang
- Department of Interventional Ultrasound, The Fifth Medical Center of Chinese PLA General Hospital, Beijing, China
| | - Qian Liu
- School of Mathematical Sciences, Zhejiang University, Hangzhou, China
| | - Jie Yu
- Department of Interventional Ultrasound, The Fifth Medical Center of Chinese PLA General Hospital, Beijing, China
| | - Lei Xu
- Zhejiang Qiushi Institute for Mathematical Medicine, Hangzhou, China
| | - Xiaoqian Zhang
- School of Mathematical Sciences, Zhejiang University, Hangzhou, China
| | - Ping Liang
- Department of Interventional Ultrasound, The Fifth Medical Center of Chinese PLA General Hospital, Beijing, China
| | - Dexing Kong
- School of Mathematical Sciences, Zhejiang University, Hangzhou, China
| |
Collapse
|
18
|
Shi J, Wang Z, Ruan S, Zhao M, Zhu Z, Kan H, An H, Xue X, Yan B. Rethinking automatic segmentation of gross target volume from a decoupling perspective. Comput Med Imaging Graph 2024; 112:102323. [PMID: 38171254 DOI: 10.1016/j.compmedimag.2023.102323] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2023] [Revised: 10/19/2023] [Accepted: 12/12/2023] [Indexed: 01/05/2024]
Abstract
Accurate and reliable segmentation of Gross Target Volume (GTV) is critical in cancer Radiation Therapy (RT) planning, but manual delineation is time-consuming and subject to inter-observer variation. Recently, deep learning methods have achieved remarkable success in medical image segmentation. However, due to low image contrast and extreme pixel imbalance between the GTV and adjacent tissues, most existing methods achieve limited performance on automatic GTV segmentation. In this paper, we propose a Heterogeneous Cascade Framework (HCF) from a decoupling perspective, which decomposes GTV segmentation into independent recognition and segmentation subtasks. The former screens out the abnormal slices containing GTV, while the latter performs pixel-wise segmentation of these slices. With the decoupled two-stage framework, we can efficiently filter normal slices to reduce false positives. To further improve segmentation performance, we design a multi-level Spatial Alignment Network (SANet) based on the feature pyramid structure, which introduces a spatial alignment module into the decoder to compensate for the information loss caused by downsampling. Moreover, we propose a Combined Regularization (CR) loss and a Balance-Sampling Strategy (BSS) to alleviate the pixel imbalance problem and improve network convergence. Extensive experiments on two public datasets of the StructSeg2019 challenge demonstrate that our method outperforms state-of-the-art methods, with significant advantages in reducing false positives and accurately segmenting small objects. The code is available at https://github.com/shijun18/GTV_AutoSeg.
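The decoupling itself, screen slices first and segment only the flagged ones, is what removes false positives on normal slices; a minimal sketch with pluggable stand-in models (the lambdas below are placeholders, not the paper's networks):

```python
import numpy as np

def decoupled_gtv_segment(volume, classify_slice, segment_slice):
    """Stage 1 screens each axial slice for GTV presence; stage 2 segments
    only the flagged slices, so normal slices contribute no false positives."""
    out = np.zeros(volume.shape, dtype=np.uint8)
    for k, sl in enumerate(volume):
        if classify_slice(sl):
            out[k] = segment_slice(sl)
    return out

vol = np.zeros((4, 5, 5))
vol[2, 2, 2] = 1.0   # one "abnormal" slice in a toy volume
seg = decoupled_gtv_segment(
    vol,
    classify_slice=lambda s: s.max() > 0.5,              # stand-in recognizer
    segment_slice=lambda s: (s > 0.5).astype(np.uint8),  # stand-in segmenter
)
```

The recognizer's recall bounds the whole pipeline: a slice it misses can never be segmented, which is why the paper pairs the cascade with imbalance-aware training.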
Collapse
Affiliation(s)
- Jun Shi
- School of Computer Science and Technology, University of Science and Technology of China, Hefei, 230026, China.
| | - Zhaohui Wang
- School of Computer Science and Technology, University of Science and Technology of China, Hefei, 230026, China.
| | - Shulan Ruan
- School of Computer Science and Technology, University of Science and Technology of China, Hefei, 230026, China.
| | - Minfan Zhao
- School of Computer Science and Technology, University of Science and Technology of China, Hefei, 230026, China.
| | - Ziqi Zhu
- School of Computer Science and Technology, University of Science and Technology of China, Hefei, 230026, China.
| | - Hongyu Kan
- School of Computer Science and Technology, University of Science and Technology of China, Hefei, 230026, China.
| | - Hong An
- School of Computer Science and Technology, University of Science and Technology of China, Hefei, 230026, China; Laoshan Laboratory Qingdao, Qindao, 266221, China.
| | - Xudong Xue
- Hubei Cancer Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430074, China.
| | - Bing Yan
- Department of radiation oncology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230001, China.
| |
Collapse
|
19
|
Liu Z, Hou J, Pan X, Zhang R, Shi Z. PA-Net: A phase attention network fusing venous and arterial phase features of CT images for liver tumor segmentation. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 244:107997. [PMID: 38176329 DOI: 10.1016/j.cmpb.2023.107997] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/25/2023] [Revised: 12/15/2023] [Accepted: 12/25/2023] [Indexed: 01/06/2024]
Abstract
BACKGROUND AND OBJECTIVE Liver cancer seriously threatens human health. In clinical diagnosis, contrast-enhanced computed tomography (CECT) images provide important supplementary information for accurate liver tumor segmentation. However, most existing methods for automatic liver tumor segmentation focus only on single-phase image features, and existing multi-modal methods achieve limited segmentation performance due to redundancy in the fused features. In addition, spatial misalignment of multi-phase images causes feature interference. METHODS In this paper, we propose a phase attention network (PA-Net) to adequately aggregate the multi-phase information of CT images and improve segmentation performance for liver tumors. Specifically, we design a PA module that generates attention weight maps voxel by voxel to efficiently fuse multi-phase CT image features and avoid feature redundancy. To solve the problem of feature interference in the multi-phase segmentation task, we design a new learning strategy and demonstrate its effectiveness experimentally. RESULTS We conduct comparative experiments on an in-house clinical dataset and achieve state-of-the-art segmentation performance among multi-phase methods. In addition, our method improves the mean Dice score by 3.3% compared with the single-phase method based on nnUNet, and our learning strategy improves the mean Dice score by 1.51% compared with the ML strategy. CONCLUSION The experimental results show that our method is superior to existing multi-phase liver tumor segmentation methods and provides a scheme for dealing with missing modalities in multi-modal tasks. In addition, our proposed learning strategy makes more effective use of arterial-phase image information and proves the most effective in liver tumor segmentation tasks using thick-slice CT images. The source code is released at https://github.com/Houjunfeng203934/PA-Net.
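Voxel-wise attention fusion of two phases can be sketched as a per-voxel softmax over phase scores (a NumPy caricature of the idea; in the real PA module the scores come from a small learned subnetwork):

```python
import numpy as np

def phase_attention_fuse(feat_v, feat_a, score_v, score_a):
    """Voxel-wise softmax over per-phase scores yields attention weights
    that blend venous and arterial features at each location."""
    m = np.maximum(score_v, score_a)            # stabilized softmax
    ev, ea = np.exp(score_v - m), np.exp(score_a - m)
    wv = ev / (ev + ea)                         # venous weight in (0, 1)
    return wv * feat_v + (1.0 - wv) * feat_a

fv = np.full((4, 4), 2.0)   # toy venous features
fa = np.zeros((4, 4))       # toy arterial features
```

Because the weights sum to one at every voxel, the fusion selects rather than accumulates, which is how redundant features are kept out of the merged map.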
Collapse
Affiliation(s)
- Zhenbing Liu
- School of Computer and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
| | - Junfeng Hou
- School of Computer and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
| | - Xipeng Pan
- School of Computer and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
| | - Ruojie Zhang
- The Second Affiliated Hospital of Guangxi Medical University, Nanning 530007, China
| | - Zhenwei Shi
- Department of Radiology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou 510080, China; Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangzhou 510080, China.
| |
Collapse
|
20
|
Seo H, Lee S, Yun S, Leem S, So S, Han DH. RenseNet: A Deep Learning Network Incorporating Residual and Dense Blocks with Edge Conservative Module to Improve Small-Lesion Classification and Model Interpretation. Cancers (Basel) 2024; 16:570. [PMID: 38339320 PMCID: PMC10854971 DOI: 10.3390/cancers16030570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2023] [Revised: 01/16/2024] [Accepted: 01/27/2024] [Indexed: 02/12/2024] Open
Abstract
Deep learning has become an essential tool in medical image analysis owing to its remarkable performance. Target classification and model interpretability are key applications of deep learning in medical image analysis, and many deep learning-based algorithms have emerged for them. Most existing algorithms include pooling operations, a type of subsampling used to enlarge the receptive field. However, from a signal-processing standpoint, pooling operations degrade image details, which is especially harmful for small objects in an image. Therefore, in this study, we designed a Rense block and an edge conservative module to effectively exploit previous feature information in the feed-forward learning process. Specifically, the Rense block, an optimal design incorporating the skip connections of both residual and dense blocks, was derived through mathematical analysis. Furthermore, we avoid blurring of the features in the pooling operation through a compensation path in the edge conservative module. Two independent CT datasets of kidney stones and lung tumors, whose images often contain small lesions, were used to verify the proposed RenseNet. The classification results and explanation heatmaps show that the proposed RenseNet provides the best inference and interpretation compared to current state-of-the-art methods. The proposed RenseNet can significantly contribute to efficient diagnosis and treatment because it is effective for small lesions that might otherwise be misclassified or misinterpreted.
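Why subsampling is risky for small lesions is easy to demonstrate: a one-pixel lesion can vanish entirely under stride-2 subsampling, while 2x2 max pooling keeps it only at half resolution, which is the motivation for a compensation path that conserves detail:

```python
import numpy as np

def max_pool2(x):
    """2x2 max pooling with stride 2 (trailing odd row/column dropped)."""
    h, w = x.shape
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.zeros((8, 8))
x[3, 5] = 1.0                 # a one-pixel "lesion" at odd coordinates

strided = x[::2, ::2]         # plain stride-2 subsampling misses it entirely
pooled = max_pool2(x)         # max pooling keeps it, but at half resolution
```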
Collapse
Affiliation(s)
- Hyunseok Seo
- Bionics Research Center, Biomedical Research Division, Korea Institute of Science and Technology (KIST), Seoul 02792, Republic of Korea; (S.L.); (S.Y.); (S.L.); (S.S.)
| | - Seokjun Lee
- Bionics Research Center, Biomedical Research Division, Korea Institute of Science and Technology (KIST), Seoul 02792, Republic of Korea; (S.L.); (S.Y.); (S.L.); (S.S.)
| | - Sojin Yun
- Bionics Research Center, Biomedical Research Division, Korea Institute of Science and Technology (KIST), Seoul 02792, Republic of Korea; (S.L.); (S.Y.); (S.L.); (S.S.)
| | - Saebom Leem
- Bionics Research Center, Biomedical Research Division, Korea Institute of Science and Technology (KIST), Seoul 02792, Republic of Korea; (S.L.); (S.Y.); (S.L.); (S.S.)
| | - Seohee So
- Bionics Research Center, Biomedical Research Division, Korea Institute of Science and Technology (KIST), Seoul 02792, Republic of Korea; (S.L.); (S.Y.); (S.L.); (S.S.)
| | - Deok Hyun Han
- Department of Urology, Samsung Medical Center (SMC), Seoul 06351, Republic of Korea;
| |
Collapse
|
21
|
Liu L, Wu K, Wang K, Han Z, Qiu J, Zhan Q, Wu T, Xu J, Zeng Z. SEU2-Net: multi-scale U2-Net with SE attention mechanism for liver occupying lesion CT image segmentation. PeerJ Comput Sci 2024; 10:e1751. [PMID: 38435550 PMCID: PMC10909188 DOI: 10.7717/peerj-cs.1751] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/13/2023] [Accepted: 11/22/2023] [Indexed: 03/05/2024]
Abstract
Liver occupying lesions can profoundly impact an individual's health and well-being. To assist physicians in the diagnosis and treatment of abnormal areas in the liver, we propose a novel network named SEU2-Net, which introduces the channel attention mechanism into U2-Net for accurate and automatic liver occupying lesion segmentation. We design the Residual U-block with Squeeze-and-Excitation (SE-RSU), which adds the Squeeze-and-Excitation (SE) attention mechanism at the residual connections of the Residual U-blocks (RSU, the component unit of U2-Net). SEU2-Net not only retains the advantages of U2-Net in capturing contextual information at multiple scales, but can also adaptively recalibrate channel feature responses to emphasize useful feature information through the channel attention mechanism. In addition, we present a new abdominal CT dataset for liver occupying lesion segmentation from Peking University First Hospital's clinical data (PUFH dataset). We evaluate the proposed method and compare it with eight deep learning networks on the PUFH and the Liver Tumor Segmentation Challenge (LiTS) datasets. The experimental results show that SEU2-Net has state-of-the-art performance and good robustness in liver occupying lesion segmentation.
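The SE recalibration added at the residual connections follows the standard squeeze-and-excite recipe; a NumPy sketch for a (C, H, W) feature map (the weight matrices stand in for the two learned dense layers):

```python
import numpy as np

def se_block(feat, w_reduce, w_expand):
    """Squeeze-and-Excitation: global-average-pool each channel, run the
    channel descriptor through a bottleneck of two dense layers (ReLU then
    sigmoid), and rescale the feature map channel-wise by the result."""
    z = feat.mean(axis=(1, 2))                     # squeeze: (C,)
    s = np.maximum(w_reduce @ z, 0.0)              # excitation, reduced dim
    gate = 1.0 / (1.0 + np.exp(-(w_expand @ s)))   # per-channel gate in (0, 1)
    return feat * gate[:, None, None]

C = 4
feat = np.ones((C, 5, 5))
# Zero placeholder weights -> gate = sigmoid(0) = 0.5 for every channel.
out = se_block(feat, np.zeros((C // 2, C)), np.zeros((C, C // 2)))
```

With trained weights, channels the gate finds informative pass through nearly unchanged while uninformative ones are suppressed, which is the "adaptive recalibration" the abstract describes.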
Affiliation(s)
- Lizhuang Liu: Shanghai Advanced Research Institute, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai, China
- Kun Wu: Shanghai Advanced Research Institute, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai, China
- Ke Wang: Radiology Department, Peking University First Hospital, Beijing, China
- Zhenqi Han: Shanghai Advanced Research Institute, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai, China
- Jianxing Qiu: Radiology Department, Peking University First Hospital, Beijing, China
- Qiao Zhan: Department of Infectious Diseases, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Tian Wu: Department of Infectious Diseases, Peking University First Hospital, Beijing, China
- Jinghang Xu: Department of Infectious Diseases, Peking University First Hospital, Beijing, China
- Zheng Zeng: Department of Infectious Diseases, Peking University First Hospital, Beijing, China
22
Yang S, Liang Y, Wu S, Sun P, Chen Z. SADSNet: A robust 3D synchronous segmentation network for liver and liver tumors based on spatial attention mechanism and deep supervision. J Xray Sci Technol 2024; 32:707-723. PMID: 38552134; DOI: 10.3233/xst-230312.
Abstract
Highlights
• Introduce a data augmentation strategy to expand the morphologically diverse data required during training, improving the algorithm's feature-learning ability on CT images of complex and varied tumor morphology.
• Design attention mechanisms for the encoding and decoding paths to extract fine pixel-level features, improve feature extraction, and achieve efficient spatial-channel feature fusion.
• Use a deep supervision layer to correct and decode the final image data, providing high accuracy.
• The effectiveness of the method is affirmed through validation on the LITS, 3DIRCADb, and SLIVER datasets.
BACKGROUND Accurately extracting the liver and liver tumors from medical images is an important step in lesion localization and diagnosis, surgical planning, and postoperative monitoring. However, the limited number of radiation therapists and the great number of images make this work time-consuming. OBJECTIVE This study designs a spatial attention deep supervised network (SADSNet) for simultaneous automatic segmentation of the liver and tumors. METHOD First, self-designed spatial attention modules are introduced at each layer of the encoder and decoder to extract image features at different scales and resolutions, helping the model better capture liver tumors and fine structures. The designed spatial attention module is implemented through two gate signals related to the liver and tumors, as well as by varying the convolutional kernel sizes. Second, deep supervision is added behind three layers of the decoder to assist the backbone network in feature learning and improve gradient propagation, enhancing robustness. RESULTS The method was tested on the LITS, 3DIRCADb, and SLIVER datasets. For the liver, it obtained Dice similarity coefficients of 97.03%, 96.11%, and 97.40%; surface Dice of 81.98%, 82.53%, and 86.29%; 95% Hausdorff distances of 8.96 mm, 8.26 mm, and 3.79 mm; and average surface distances of 1.54 mm, 1.19 mm, and 0.81 mm. It also achieved precise tumor segmentation, with Dice scores of 87.81% and 87.50%, surface Dice of 89.63% and 84.26%, 95% Hausdorff distances of 12.96 mm and 16.55 mm, and average surface distances of 1.11 mm and 3.04 mm on LITS and 3DIRCADb, respectively. CONCLUSION The experimental results show that the proposed method is effective and superior to several other methods, and can therefore provide technical support for liver and liver tumor segmentation in clinical practice.
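The evaluation behind figures like these rests on overlap metrics such as the Dice similarity coefficient. A minimal NumPy version for binary masks (not the authors' evaluation code) is:

```python
import numpy as np

def dice(pred, gt, eps=1e-7):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|) for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + eps)

# Half-overlapping 1D masks: intersection 1, mask sizes 2 and 2,
# so Dice = 2*1/(2+2) ≈ 0.5.
d = dice(np.array([1, 1, 0, 0]), np.array([1, 0, 1, 0]))
```

Dice measures volumetric overlap only; that is why papers such as this one also report surface Dice, 95% Hausdorff distance, and average surface distance, which are sensitive to boundary errors that barely move the Dice score.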
Affiliation(s)
- Sijing Yang: School of Life and Environmental Science, Guilin University of Electronic Technology, Guilin, China
- Yongbo Liang: School of Life and Environmental Science, Guilin University of Electronic Technology, Guilin, China
- Shang Wu: School of Life and Environmental Science, Guilin University of Electronic Technology, Guilin, China
- Peng Sun: School of Electronic Engineering and Automation, Guilin University of Electronic Technology, Guilin, China
- Zhencheng Chen: School of Life and Environmental Science, Guilin University of Electronic Technology, Guilin, China; School of Electronic Engineering and Automation, Guilin University of Electronic Technology, Guilin, China; Guangxi Colleges and Universities Key Laboratory of Biomedical Sensors and Intelligent Instruments, Guilin, China; Guangxi Engineering Technology Research Center of Human Physiological Information Noninvasive Detection, Guilin, China
23
Xie Y, Zhang J, Xia Y, Shen C. Learning From Partially Labeled Data for Multi-Organ and Tumor Segmentation. IEEE Trans Pattern Anal Mach Intell 2023; 45:14905-14919. PMID: 37672381; DOI: 10.1109/tpami.2023.3312587.
Abstract
Medical image benchmarks for the segmentation of organs and tumors suffer from partial labeling due to the intensive cost of labor and expertise. Current mainstream approaches follow the practice of one network solving one task. With this pipeline, not only is performance limited by the typically small dataset of a single task, but the computational cost also increases linearly with the number of tasks. To address this, we propose a Transformer-based dynamic on-demand network (TransDoDNet) that learns to segment organs and tumors on multiple partially labeled datasets. Specifically, TransDoDNet has a hybrid backbone composed of a convolutional neural network and a Transformer. A dynamic head enables the network to accomplish multiple segmentation tasks flexibly. Unlike existing approaches that fix kernels after training, the kernels in the dynamic head are generated adaptively by the Transformer, which employs self-attention to model long-range organ-wise dependencies and decodes an organ embedding that represents each organ. We create a large-scale partially labeled Multi-Organ and Tumor Segmentation benchmark, termed MOTS, and demonstrate the superior performance of TransDoDNet over other competitors on seven organ and tumor segmentation tasks. This study also provides a general 3D medical image segmentation model, pre-trained on the large-scale MOTS benchmark, that outperforms current predominant self-supervised learning methods.
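The dynamic-head idea, kernels generated from an organ embedding instead of being fixed after training, can be illustrated as a 1 × 1 convolution whose weights come from a linear controller. This NumPy sketch is a simplification: the embedding, controller, and shapes are invented, and in the paper the embedding is decoded by a Transformer rather than supplied directly.

```python
import numpy as np

def dynamic_head(feat, organ_emb, controller_w):
    """Dynamic segmentation head: the kernel weights are *generated* per task.

    feat:         (C, N) flattened feature map (C channels, N voxels)
    organ_emb:    (E,)   embedding of the organ/task to segment
    controller_w: (E, C + 1) linear controller mapping the embedding to
                  C kernel weights plus one bias
    One shared backbone plus different embeddings yields a different
    1x1-conv head for each partially labeled task.
    """
    params = organ_emb @ controller_w            # (C + 1,) generated parameters
    w, b = params[:-1], params[-1]
    logits = w @ feat + b                        # (N,) per-voxel logits
    return 1.0 / (1.0 + np.exp(-logits))         # foreground probabilities

rng = np.random.default_rng(1)
feat = rng.normal(size=(16, 100))                # e.g. a 10x10 patch, 16 channels
liver_emb = rng.normal(size=(8,))                # hypothetical task embedding
controller = rng.normal(size=(8, 17)) * 0.1
probs = dynamic_head(feat, liver_emb, controller)  # (100,) values in (0, 1)
```

Swapping `liver_emb` for another organ's embedding changes the head's weights without touching the backbone, which is how one network can serve datasets where only some organs are annotated.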
24
Ikuta M, Zhang J. A Deep Convolutional Gated Recurrent Unit for CT Image Reconstruction. IEEE Trans Neural Netw Learn Syst 2023; 34:10612-10625. PMID: 35522637; DOI: 10.1109/tnnls.2022.3169569.
Abstract
Computed tomography (CT) is one of the most important medical imaging technologies in use today. Most commercial CT products use a technique known as filtered backprojection (FBP), which is fast and can produce decent image quality when the X-ray dose is high. However, FBP is not good enough for low-dose X-ray CT imaging, because the reconstruction problem becomes more stochastic. A more effective technique, proposed recently and implemented in a limited number of commercial CT products, is iterative reconstruction (IR). The IR technique is based on a Bayesian formulation of the CT image reconstruction problem, with an explicit model of the CT scanning process, including its stochastic nature, and a prior model that incorporates our knowledge of what a good CT image should look like. However, constructing such prior knowledge is more complicated than it seems. In this article, we propose a novel neural network for CT image reconstruction. The network is based on the IR formulation and built from a recurrent neural network (RNN). Specifically, we transform the gated recurrent unit (GRU) into a neural network that performs CT image reconstruction; we call it "GRU reconstruction." This network conducts concurrent dual-domain learning. Many deep learning (DL)-based methods in medical imaging use single-domain learning, but dual-domain learning performs better because it learns from both the sinogram and the image domain. In addition, we propose backpropagation through stage (BPTS) as a new RNN backpropagation algorithm. It is similar to backpropagation through time (BPTT) but tailored for iterative optimization. Results from extensive experiments indicate that our proposed method outperforms conventional model-based methods, single-domain DL methods, and state-of-the-art DL techniques in terms of root mean squared error (RMSE), peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and visual appearance.
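The recurrence at the heart of "GRU reconstruction" is the standard GRU cell. In NumPy, a single step looks like the following (a generic GRU, not the paper's dual-domain network; all shapes and weights here are placeholders):

```python
import numpy as np

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One gated recurrent unit (GRU) step.

    z (update gate) decides how much of the previous state to overwrite,
    r (reset gate) decides how much of it feeds the candidate state.
    Read through an iterative-reconstruction lens, h plays the role of the
    current image estimate and each recurrence refines it, the way one
    iteration of IR would.
    """
    sig = lambda a: 1.0 / (1.0 + np.exp(-a))
    z = sig(x @ Wz + h @ Uz)                    # update gate
    r = sig(x @ Wr + h @ Ur)                    # reset gate
    h_cand = np.tanh(x @ Wh + (r * h) @ Uh)     # candidate state
    return (1.0 - z) * h + z * h_cand           # blended new state

rng = np.random.default_rng(2)
x = rng.normal(size=(4,))                       # current input features
h = np.zeros(6)                                 # initial state estimate
mk = lambda m, n: rng.normal(size=(m, n)) * 0.1
h1 = gru_step(x, h, mk(4, 6), mk(6, 6), mk(4, 6), mk(6, 6), mk(4, 6), mk(6, 6))
```

The proposed BPTS differs from ordinary BPTT in how gradients flow across these stages; the cell itself is unchanged.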
25
Chen Z, Hua S, Gao J, Chen Y, Gong Y, Shen Y, Tang X, Emu Y, Jin W, Hu C. A dual-stage partially interpretable neural network for joint suppression of bSSFP banding and flow artifacts in non-phase-cycled cine imaging. J Cardiovasc Magn Reson 2023; 25:68. PMID: 37993824; PMCID: PMC10666342; DOI: 10.1186/s12968-023-00988-z.
Abstract
PURPOSE To develop a partially interpretable neural network for joint suppression of banding and flow artifacts in non-phase-cycled bSSFP cine imaging. METHODS A dual-stage neural network consisting of a voxel-identification (VI) sub-network and an artifact-suppression (AS) sub-network is proposed. The VI sub-network identifies artifacts, which guides artifact suppression and improves interpretability; the AS sub-network reduces banding and flow artifacts. Short-axis cine images at 12 frequency offsets from 28 healthy subjects were used to train and test the dual-stage network. An additional 77 patients were retrospectively enrolled to evaluate its clinical generalizability. For healthy subjects, artifact suppression performance was analyzed by comparison with traditional phase cycling. The partial interpretability provided by the VI sub-network was analyzed via correlation analysis. Generalizability was evaluated for cine obtained with different sequence parameters and scanners. For patients, artifact suppression performance and the partial interpretability of the network were qualitatively evaluated by 3 clinicians. Cardiac function before and after artifact suppression was assessed via left ventricular ejection fraction (LVEF). RESULTS For the healthy subjects, visual inspection and quantitative analysis found a considerable reduction of banding and flow artifacts by the proposed network. Compared with traditional phase cycling, the proposed network improved flow artifact scores (4.57 ± 0.23 vs 3.40 ± 0.38, P = 0.002) and overall image quality (4.33 ± 0.22 vs 3.60 ± 0.38, P = 0.002). The VI sub-network accurately identified the locations of banding and flow artifacts in the original cine, and its output correlated significantly with the change of signal intensities in these regions. Changes of imaging parameters or scanner did not cause a significant change of overall image quality relative to the baseline dataset, suggesting good generalizability.
For the patients, qualitative analysis showed a significant improvement of banding artifacts (4.01 ± 0.50 vs 2.77 ± 0.40, P < 0.001), flow artifacts (4.22 ± 0.38 vs 2.97 ± 0.57, P < 0.001), and image quality (3.91 ± 0.45 vs 2.60 ± 0.43, P < 0.001) relative to the original cine. The artifact suppression slightly reduced the LVEF (mean bias = -1.25%, P = 0.01). CONCLUSIONS The dual-stage network simultaneously reduces banding and flow artifacts in bSSFP cine imaging with a partial interpretability, sparing the need for sequence modification. The method can be easily deployed in a clinical setting to identify artifacts and improve cine image quality.
Affiliation(s)
- Zhuo Chen: National Engineering Research Center of Advanced Magnetic Resonance Technologies for Diagnosis and Therapy, School of Biomedical Engineering, Shanghai Jiao Tong University, 415 S Med-X Center, 1954 Huashan Road, Shanghai, 200030, China
- Sha Hua: Department of Cardiovascular Medicine, Heart Failure Center, Ruijin Hospital Lu Wan Branch, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Juan Gao: National Engineering Research Center of Advanced Magnetic Resonance Technologies for Diagnosis and Therapy, School of Biomedical Engineering, Shanghai Jiao Tong University, 415 S Med-X Center, 1954 Huashan Road, Shanghai, 200030, China
- Yanjia Chen: Department of Cardiovascular Medicine, Heart Failure Center, Ruijin Hospital Lu Wan Branch, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Yiwen Gong: Department of Cardiovascular Medicine, Heart Failure Center, Ruijin Hospital Lu Wan Branch, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Yiwen Shen: Department of Cardiovascular Medicine, Heart Failure Center, Ruijin Hospital Lu Wan Branch, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Xin Tang: National Engineering Research Center of Advanced Magnetic Resonance Technologies for Diagnosis and Therapy, School of Biomedical Engineering, Shanghai Jiao Tong University, 415 S Med-X Center, 1954 Huashan Road, Shanghai, 200030, China
- Yixin Emu: National Engineering Research Center of Advanced Magnetic Resonance Technologies for Diagnosis and Therapy, School of Biomedical Engineering, Shanghai Jiao Tong University, 415 S Med-X Center, 1954 Huashan Road, Shanghai, 200030, China
- Wei Jin: Department of Cardiovascular Medicine, Heart Failure Center, Ruijin Hospital Lu Wan Branch, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Chenxi Hu: National Engineering Research Center of Advanced Magnetic Resonance Technologies for Diagnosis and Therapy, School of Biomedical Engineering, Shanghai Jiao Tong University, 415 S Med-X Center, 1954 Huashan Road, Shanghai, 200030, China
26
Chen Y, Yu L, Wang JY, Panjwani N, Obeid JP, Liu W, Liu L, Kovalchuk N, Gensheimer MF, Vitzthum LK, Beadle BM, Chang DT, Le QT, Han B, Xing L. Adaptive Region-Specific Loss for Improved Medical Image Segmentation. IEEE Trans Pattern Anal Mach Intell 2023; 45:13408-13421. PMID: 37363838; PMCID: PMC11346301; DOI: 10.1109/tpami.2023.3289667.
Abstract
Defining the loss function is an important part of neural network design and critically determines the success of deep learning modeling. A significant shortcoming of the conventional loss functions is that they weight all regions in the input image volume equally, despite the fact that the system is known to be heterogeneous (i.e., some regions can achieve high prediction performance more easily than others). Here, we introduce a region-specific loss to lift the implicit assumption of homogeneous weighting for better learning. We divide the entire volume into multiple sub-regions, each with an individualized loss constructed for optimal local performance. Effectively, this scheme imposes higher weightings on the sub-regions that are more difficult to segment, and vice versa. Furthermore, the regional false positive and false negative errors are computed for each input image during a training step and the regional penalty is adjusted accordingly to enhance the overall accuracy of the prediction. Using different public and in-house medical image datasets, we demonstrate that the proposed regionally adaptive loss paradigm outperforms conventional methods in the multi-organ segmentations, without any modification to the neural network architecture or additional data preparation.
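The regionally adaptive weighting described above can be miniaturized to a 2D toy: split the map into a grid of sub-regions, compute cross-entropy per region, and up-weight regions with higher false-positive/false-negative rates. The grid size and the simple additive penalty below are assumptions for illustration, not the paper's exact update rule.

```python
import numpy as np

def region_adaptive_loss(prob, gt, grid=(2, 2), eps=1e-7):
    """Region-specific loss sketch on a 2D probability map.

    Each grid cell gets its own cross-entropy term, scaled by the cell's
    current false-positive and false-negative rates, so sub-regions that
    are harder to segment contribute more to the total loss.
    """
    h, w = prob.shape
    gh, gw = grid
    total = 0.0
    for i in range(gh):
        for j in range(gw):
            p = prob[i * h // gh:(i + 1) * h // gh, j * w // gw:(j + 1) * w // gw]
            g = gt[i * h // gh:(i + 1) * h // gh, j * w // gw:(j + 1) * w // gw]
            bce = -(g * np.log(p + eps) + (1 - g) * np.log(1 - p + eps)).mean()
            fp = ((p > 0.5) & (g == 0)).mean()   # regional false-positive rate
            fn = ((p <= 0.5) & (g == 1)).mean()  # regional false-negative rate
            total += (1.0 + fp + fn) * bce       # harder region -> higher weight
    return total / (gh * gw)

gt = np.zeros((4, 4)); gt[:2, :2] = 1.0          # square "organ" in one corner
pred = np.where(gt == 1, 0.9, 0.1)               # confident, mostly correct map
loss = region_adaptive_loss(pred, gt)
```

Because the per-region weights are recomputed from the current prediction at each training step, the penalty tracks where the model is struggling, without any change to the network architecture.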
27
Li C, Bagher-Ebadian H, Sultan RI, Elshaikh M, Movsas B, Zhu D, Chetty IJ. A new architecture combining convolutional and transformer-based networks for automatic 3D multi-organ segmentation on CT images. Med Phys 2023; 50:6990-7002. PMID: 37738468; DOI: 10.1002/mp.16750.
Abstract
PURPOSE Deep learning-based networks have become increasingly popular in the field of medical image segmentation. The purpose of this research was to develop and optimize a new architecture for automatic segmentation of the prostate gland and normal organs in the pelvic, thoracic, and upper gastro-intestinal (GI) regions. METHODS We developed an architecture that combines a shifted-window (Swin) transformer with a convolutional U-Net. The network includes a parallel encoder, a cross-fusion block, and a CNN-based decoder to extract local and global information and merge related features on the same scale. A skip connection is applied between the cross-fusion block and decoder to integrate low-level semantic features. Attention gates (AGs) are integrated within the CNN to suppress features in image background regions. We term the network "SwinAttUNet" and optimized the architecture for automatic image segmentation. Training datasets consisted of planning-CT datasets from 300 prostate cancer patients from an institutional database and 100 CT datasets from a publicly available dataset (CT-ORG). Images were linearly interpolated and resampled to a spatial resolution of (1.0 × 1.0 × 1.5) mm³. A volume patch (192 × 192 × 96) was used for training and inference, and the dataset was split into training (75%), validation (10%), and test (15%) cohorts. Data augmentation transforms were applied, consisting of random flip, rotation, and intensity scaling. The loss function comprised Dice and cross-entropy terms, equally weighted and summed. We evaluated Dice coefficients (DSC), 95th percentile Hausdorff distances (HD95), and average surface distances (ASD) between the results of our network and ground truth data. RESULTS For SwinAttUNet, DSC values were 86.54 ± 1.21, 94.15 ± 1.17, and 87.15 ± 1.68% and HD95 values were 5.06 ± 1.42, 3.16 ± 0.93, and 5.54 ± 1.63 mm for the prostate, bladder, and rectum, respectively. Respective ASD values were 1.45 ± 0.57, 0.82 ± 0.12, and 1.42 ± 0.38 mm. For the lung, liver, kidneys, and pelvic bones, respective DSC values were 97.90 ± 0.80, 96.16 ± 0.76, 93.74 ± 2.25, and 89.31 ± 3.87%; respective HD95 values were 5.13 ± 4.11, 2.73 ± 1.19, 2.29 ± 1.47, and 5.31 ± 1.25 mm; and respective ASD values were 1.88 ± 1.45, 1.78 ± 1.21, 0.71 ± 0.43, and 1.21 ± 1.11 mm. Our network outperformed several existing deep learning approaches that use only attention-based convolutional or Transformer-based feature strategies, as detailed in the results section. CONCLUSIONS We have demonstrated that our new architecture combining Transformer- and convolution-based features is able to better learn the local and global context for automatic segmentation of multi-organ, CT-based anatomy.
Affiliation(s)
- Chengyin Li: College of Engineering - Dept. of Computer Science, Wayne State University, Detroit, Michigan, USA
- Hassan Bagher-Ebadian: Department of Radiation Oncology, Henry Ford Cancer Institute, Detroit, Michigan, USA; Department of Radiology, Michigan State University, East Lansing, Michigan, USA; Department of Osteopathic Medicine, Michigan State University, East Lansing, Michigan, USA; Department of Physics, Oakland University, Rochester, Michigan, USA
- Rafi Ibn Sultan: College of Engineering - Dept. of Computer Science, Wayne State University, Detroit, Michigan, USA
- Mohamed Elshaikh: Department of Radiation Oncology, Henry Ford Cancer Institute, Detroit, Michigan, USA
- Benjamin Movsas: Department of Radiation Oncology, Henry Ford Cancer Institute, Detroit, Michigan, USA
- Dongxiao Zhu: College of Engineering - Dept. of Computer Science, Wayne State University, Detroit, Michigan, USA
- Indrin J Chetty: Department of Radiation Oncology, Henry Ford Cancer Institute, Detroit, Michigan, USA; Department of Radiation Oncology, Cedars Sinai Medical Center, Los Angeles, CA, USA
28
Tian M, Wang H, Liu X, Ye Y, Ouyang G, Shen Y, Li Z, Wang X, Wu S. Delineation of clinical target volume and organs at risk in cervical cancer radiotherapy by deep learning networks. Med Phys 2023; 50:6354-6365. PMID: 37246619; DOI: 10.1002/mp.16468.
Abstract
PURPOSE Delineation of the clinical target volume (CTV) and organs at risk (OARs) is important in cervical cancer radiotherapy, but it is generally labor-intensive, time-consuming, and subjective. This paper proposes a parallel-path attention fusion network (PPAF-net) to overcome these disadvantages in the delineation task. METHODS The PPAF-net utilizes both the texture and structure information of the CTV and OARs by employing a U-Net network to capture high-level texture information and an up-sampling and down-sampling (USDS) network to capture low-level structure information that accentuates the boundaries of the CTV and OARs. Multi-level features extracted from both networks are then fused through an attention module to generate the delineation result. RESULTS The dataset contains 276 computed tomography (CT) scans of patients with stage IB-IIA cervical cancer, provided by the West China Hospital of Sichuan University. Simulation results demonstrate that PPAF-net performs favorably on the delineation of the CTV and OARs (e.g., rectum, bladder, etc.) and achieves state-of-the-art delineation accuracy. In terms of the Dice similarity coefficient (DSC) and Hausdorff distance (HD), it achieves 88.61% and 2.25 cm for the CTV, 92.27% and 0.73 cm for the rectum, 96.74% and 0.68 cm for the bladder, 96.38% and 0.65 cm for the left kidney, 96.79% and 0.63 cm for the right kidney, 93.42% and 0.52 cm for the left femoral head, 93.69% and 0.51 cm for the right femoral head, 87.53% and 1.07 cm for the small intestine, and 91.50% and 0.84 cm for the spinal cord. CONCLUSIONS The proposed automatic delineation network PPAF-net performs well on CTV and OAR segmentation tasks, with great potential for reducing the burden on radiation oncologists and increasing delineation accuracy. In the future, radiation oncologists from the West China Hospital of Sichuan University will further evaluate the network's delineations, helping to bring this method into clinical practice.
Affiliation(s)
- Miao Tian: School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Hongqiu Wang: School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Xingang Liu: School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Yuyun Ye: Department of Electrical and Computer Engineering, University of Tulsa, Tulsa, USA
- Ganlu Ouyang: Department of Radiation Oncology, Cancer Center, the West China Hospital of Sichuan University, Chengdu, China
- Yali Shen: Department of Radiation Oncology, Cancer Center, the West China Hospital of Sichuan University, Chengdu, China
- Zhiping Li: Department of Radiation Oncology, Cancer Center, the West China Hospital of Sichuan University, Chengdu, China
- Xin Wang: Department of Radiation Oncology, Cancer Center, the West China Hospital of Sichuan University, Chengdu, China
- Shaozhi Wu: School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China; Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
29
Radiya K, Joakimsen HL, Mikalsen KØ, Aahlin EK, Lindsetmo RO, Mortensen KE. Performance and clinical applicability of machine learning in liver computed tomography imaging: a systematic review. Eur Radiol 2023; 33:6689-6717. PMID: 37171491; PMCID: PMC10511359; DOI: 10.1007/s00330-023-09609-w.
Abstract
OBJECTIVES Machine learning (ML) for medical imaging is emerging for several organs and image modalities. Our objectives were to provide clinicians with an overview of this field by answering the following questions: (1) How is ML applied in liver computed tomography (CT) imaging? (2) How well do ML systems perform in liver CT imaging? (3) What are the clinical applications of ML in liver CT imaging? METHODS A systematic review was carried out according to the guidelines of the PRISMA-P statement. The search string focused on studies containing content relating to artificial intelligence, liver, and computed tomography. RESULTS One hundred ninety-one studies were included. In the majority of studies, ML was applied to CT liver imaging as image analysis without clinician intervention, while newer studies combine ML methods with clinical intervention. Several were documented to perform very accurately on reliable but small datasets. Most models identified were deep learning-based, mainly using convolutional neural networks. Many potential clinical applications of ML to CT liver imaging were identified through our review, including segmentation and classification of the liver and its lesions, segmentation of vascular structures inside the liver, fibrosis and cirrhosis staging, metastasis prediction, and evaluation of chemotherapy. CONCLUSION Several studies attempted to provide transparent results for their models. To make such models convenient for clinical application, prospective clinical validation studies are urgently needed; computer scientists and engineers should seek to cooperate with health professionals to ensure this. KEY POINTS • ML shows great potential for CT liver image tasks such as pixel-wise segmentation and classification of the liver and liver lesions, fibrosis staging, metastasis prediction, and retrieval of relevant liver lesions from similar cases of other patients. • Although result reporting is not standardized, many studies have attempted to present transparent results so that ML performance can be interpreted from the literature. • Prospective studies for clinical validation of ML methods are urgently needed, preferably carried out in cooperation between clinicians and computer scientists.
Affiliation(s)
- Keyur Radiya: Department of Gastroenterological Surgery at University Hospital of North Norway (UNN), Tromso, Norway; Department of Clinical Medicine, UiT The Arctic University of Norway, Tromso, Norway
- Henrik Lykke Joakimsen: Institute of Clinical Medicine, UiT The Arctic University of Norway, Tromso, Norway; Centre for Clinical Artificial Intelligence (SPKI), University Hospital of North Norway, Tromso, Norway
- Karl Øyvind Mikalsen: Department of Clinical Medicine, UiT The Arctic University of Norway, Tromso, Norway; Centre for Clinical Artificial Intelligence (SPKI), University Hospital of North Norway, Tromso, Norway; UiT Machine Learning Group, Department of Physics and Technology, UiT the Arctic University of Norway, Tromso, Norway
- Eirik Kjus Aahlin: Department of Gastroenterological Surgery at University Hospital of North Norway (UNN), Tromso, Norway
- Rolv-Ole Lindsetmo: Department of Clinical Medicine, UiT The Arctic University of Norway, Tromso, Norway; Head Clinic of Surgery, Oncology and Women Health, University Hospital of North Norway, Tromso, Norway
- Kim Erlend Mortensen: Department of Gastroenterological Surgery at University Hospital of North Norway (UNN), Tromso, Norway; Department of Clinical Medicine, UiT The Arctic University of Norway, Tromso, Norway
30
Chen Y, Gensheimer MF, Bagshaw HP, Butler S, Yu L, Zhou Y, Shen L, Kovalchuk N, Surucu M, Chang DT, Xing L, Han B. Patient-Specific Auto-segmentation on Daily kVCT Images for Adaptive Radiation Therapy. Int J Radiat Oncol Biol Phys 2023; 117:505-514. PMID: 37141982; DOI: 10.1016/j.ijrobp.2023.04.026.
Abstract
PURPOSE This study explored deep-learning-based patient-specific auto-segmentation using transfer learning on daily RefleXion kilovoltage computed tomography (kVCT) images to facilitate adaptive radiation therapy, based on data from the first group of patients treated with the innovative RefleXion system. METHODS AND MATERIALS For head and neck (HaN) and pelvic cancers, a deep convolutional segmentation network was initially trained on a population data set that contained 67 and 56 patient cases, respectively. Then the pretrained population network was adapted to the specific RefleXion patient by fine-tuning the network weights with a transfer learning method. For each of the 6 collected RefleXion HaN cases and 4 pelvic cases, initial planning computed tomography (CT) scans and 5 to 26 sets of daily kVCT images were used for the patient-specific learning and evaluation separately. The performance of the patient-specific network was compared with the population network and the clinical rigid registration method and evaluated by the Dice similarity coefficient (DSC) with manual contours being the reference. The corresponding dosimetric effects resulting from different auto-segmentation and registration methods were also investigated. RESULTS The proposed patient-specific network achieved mean DSC results of 0.88 for 3 HaN organs at risk (OARs) of interest and 0.90 for 8 pelvic target and OARs, outperforming the population network (0.70 and 0.63) and the registration method (0.72 and 0.72). The DSC of the patient-specific network gradually increased with the increment of longitudinal training cases and approached saturation with more than 6 training cases. Compared with using the registration contour, the target and OAR mean doses and dose-volume histograms obtained using the patient-specific auto-segmentation were closer to the results using the manual contour. 
CONCLUSIONS Auto-segmentation of RefleXion kVCT images based on the patient-specific transfer learning could achieve higher accuracy, outperforming a common population network and clinical registration-based method. This approach shows promise in improving dose evaluation accuracy in RefleXion adaptive radiation therapy.
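The pretrain-on-population, fine-tune-on-patient recipe can be miniaturized with a logistic-regression stand-in. This is purely illustrative: the model, synthetic data, learning rates, and step counts below are invented, whereas the study fine-tunes a deep convolutional segmentation network on daily kVCT images.

```python
import numpy as np

def train(X, y, w=None, lr=0.1, steps=200):
    """Gradient-descent logistic regression; pass a pretrained w to
    fine-tune (transfer learning) instead of training from scratch."""
    w = np.zeros(X.shape[1]) if w is None else w.copy()
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))     # predicted probabilities
        w -= lr * X.T @ (p - y) / len(y)       # cross-entropy gradient step
    return w

rng = np.random.default_rng(0)
# "Population" stage: plenty of data, train from scratch.
X_pop = rng.normal(size=(200, 5))
y_pop = (X_pop[:, 0] > 0).astype(float)
w_pop = train(X_pop, y_pop)
# "Patient-specific" stage: few samples with a shifted distribution;
# start from the population weights and fine-tune with a small rate.
X_pat = rng.normal(size=(20, 5)) + 0.3
y_pat = (X_pat[:, 0] > 0.3).astype(float)
w_pat = train(X_pat, y_pat, w=w_pop, lr=0.02, steps=50)
```

The key design point survives the simplification: the second call starts from the pretrained weights and uses a smaller learning rate and fewer steps, so the handful of patient samples adjusts the population knowledge rather than overwriting it.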
Collapse
Affiliation(s)
- Yizheng Chen
- Department of Radiation Oncology, Stanford University, Stanford, California
| | | | - Hilary P Bagshaw
- Department of Radiation Oncology, Stanford University, Stanford, California
| | - Santino Butler
- Department of Radiation Oncology, Stanford University, Stanford, California
| | - Lequan Yu
- Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong, China
| | - Yuyin Zhou
- Department of Computer Science and Engineering, University of California Santa Cruz, Santa Cruz, California
| | - Liyue Shen
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts
| | - Nataliya Kovalchuk
- Department of Radiation Oncology, Stanford University, Stanford, California
| | - Murat Surucu
- Department of Radiation Oncology, Stanford University, Stanford, California
| | - Daniel T Chang
- Department of Radiation Oncology, University of Michigan, Ann Arbor, Michigan
| | - Lei Xing
- Department of Radiation Oncology, Stanford University, Stanford, California
| | - Bin Han
- Department of Radiation Oncology, Stanford University, Stanford, California.
| |
Collapse
|
31
|
Hille G, Agrawal S, Tummala P, Wybranski C, Pech M, Surov A, Saalfeld S. Joint liver and hepatic lesion segmentation in MRI using a hybrid CNN with transformer layers. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023; 240:107647. [PMID: 37329803 DOI: 10.1016/j.cmpb.2023.107647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/06/2022] [Revised: 04/21/2023] [Accepted: 06/05/2023] [Indexed: 06/19/2023]
Abstract
BACKGROUND AND OBJECTIVE Deep learning-based segmentation of the liver and of hepatic lesions steadily gains relevance in clinical practice due to the increasing yearly incidence of liver cancer. Although various network variants with overall promising results have been developed for medical image segmentation in recent years, almost all of them struggle to accurately segment hepatic lesions in magnetic resonance imaging (MRI). This led to the idea of combining elements of convolutional and transformer-based architectures to overcome the existing limitations. METHODS This work presents a hybrid network called SWTR-Unet, consisting of a pretrained ResNet, transformer blocks, and a common Unet-style decoder path. This network was primarily applied to single-modality non-contrast-enhanced liver MRI and, additionally, to the publicly available computed tomography (CT) data of the liver tumor segmentation (LiTS) challenge to verify its applicability to other modalities. For a broader evaluation, multiple state-of-the-art networks were implemented and applied, ensuring direct comparability. Furthermore, correlation analysis and an ablation study were carried out to investigate various factors influencing the segmentation accuracy of the presented method. RESULTS With average Dice similarity scores of 98±2% for liver and 81±28% for lesion segmentation on the MRI dataset, and 97±2% and 79±25%, respectively, on the CT dataset, the proposed SWTR-Unet proved to be a precise approach for liver and hepatic lesion segmentation, with state-of-the-art results on MRI and competitive accuracy on CT. CONCLUSION The achieved segmentation accuracy was found to be on par with manual expert segmentations, as indicated by inter-observer variabilities for liver lesion segmentation. In conclusion, the presented method could save valuable time and resources in clinical practice.
Collapse
Affiliation(s)
- Georg Hille
- Department of Simulation and Graphics, Otto-von-Guericke University, Magdeburg, Germany.
| | - Shubham Agrawal
- Department of Simulation and Graphics, Otto-von-Guericke University, Magdeburg, Germany
| | - Pavan Tummala
- Department of Simulation and Graphics, Otto-von-Guericke University, Magdeburg, Germany
| | - Christian Wybranski
- Department of Radiology, University Hospital of Magdeburg, Magdeburg, Germany
| | - Maciej Pech
- Department of Radiology, University Hospital of Magdeburg, Magdeburg, Germany
| | - Alexey Surov
- Department of Radiology, University Hospital of Magdeburg, Magdeburg, Germany
| | - Sylvia Saalfeld
- Department of Simulation and Graphics, Otto-von-Guericke University, Magdeburg, Germany
| |
Collapse
|
32
|
Saumiya S, Franklin SW. Residual Deformable Split Channel and Spatial U-Net for Automated Liver and Liver Tumour Segmentation. J Digit Imaging 2023; 36:2164-2178. [PMID: 37464213 PMCID: PMC10501969 DOI: 10.1007/s10278-023-00874-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/10/2023] [Revised: 06/20/2023] [Accepted: 06/21/2023] [Indexed: 07/20/2023] Open
Abstract
Accurate segmentation of the liver and liver tumour (LT) is challenging due to hazy boundaries and large shape variability. Although using U-Net for liver and LT segmentation achieves better results than manual segmentation, it loses spatial and channel features during segmentation, leading to inaccurate liver and LT segmentation. A residual deformable split depth-wise separable U-Net (RDSDSU-Net) is proposed to increase the accuracy of liver and LT segmentation. The residual deformable convolution layer (DCL) with deformable pooling (DP) is used in the encoder as an attention mechanism to adaptively extract liver and LT shape and position characteristics. Afterward, a convolutional spatial and channel features split graph network (CSCFSG-Net) is introduced in the middle processing layer to improve the expression capability of the liver and LT features by capturing spatial and channel features separately and extracting global contextual liver and LT information from them. Sub-pixel convolutions (SPC) are used in the decoder to prevent chequerboard artefacts in the segmentation results. Also, the residual deformable encoder features are combined with the decoder through summation to avoid increasing the number of feature maps (FM). Finally, the efficiency of the RDSDSU-Net is evaluated on the 3DIRCADb and LiTS datasets. The proposed RDSDSU-Net achieved a Dice score of 98.21% for liver segmentation and 93.25% for LT segmentation on 3DIRCADb. The experimental outcomes illustrate that the proposed RDSDSU-Net model achieved better segmentation results than existing techniques.
Collapse
Affiliation(s)
- S Saumiya
- Department of ECE, Bethlahem Institute of Engineering, Karungal, Tamil Nadu India
| | - S Wilfred Franklin
- Department of ECE, CSI Institute of Technology, Thovalai, Tamil Nadu India
| |
Collapse
|
33
|
Liu Y, Liang P, Liang K, Chang Q. Automatic and efficient pneumothorax segmentation from CT images using EFA-Net with feature alignment function. Sci Rep 2023; 13:15291. [PMID: 37714871 PMCID: PMC10504271 DOI: 10.1038/s41598-023-42388-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2023] [Accepted: 09/09/2023] [Indexed: 09/17/2023] Open
Abstract
Pneumothorax is a condition involving a collapsed lung, which requires accurate segmentation of computed tomography (CT) images for effective clinical decision-making. Numerous convolutional neural network-based methods for medical image segmentation have been proposed, but they often struggle to balance model complexity with performance. To address this, we introduce the Efficient Feature Alignment Network (EFA-Net), a novel medical image segmentation network designed specifically for pneumothorax CT segmentation. EFA-Net uses EfficientNet as an encoder to extract features and a Feature Alignment (FA) module as a decoder to align features in both the spatial and channel dimensions. This design allows EFA-Net to achieve superior segmentation performance with reduced model complexity. In our dataset, our method outperforms various state-of-the-art methods in terms of accuracy and efficiency, achieving a Dice coefficient of 90.03%, an Intersection over Union (IOU) of 81.80%, and a sensitivity of 88.94%. Notably, EFA-Net has significantly lower FLOPs (1.549G) and parameters (0.432M), offering better robustness and facilitating easier deployment. Future work will explore the integration of downstream applications to enhance EFA-Net's utility for clinicians and patients in real-world diagnostic scenarios. The source code of EFA-Net is available at: https://github.com/tianjiamutangchun/EFA-Net .
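EFA-Net is evaluated with Dice, Intersection over Union (IoU), and sensitivity. As a reference for how the latter two overlap metrics derive from confusion-matrix counts, a minimal Python sketch on flat binary masks (illustrative only, not tied to the paper's code):

```python
def confusion_counts(pred, truth):
    """True positives, false positives, false negatives for binary masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    return tp, fp, fn

def iou(pred, truth):
    """Intersection over Union: TP / (TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = tp + fp + fn
    return tp / denom if denom else 1.0

def sensitivity(pred, truth):
    """Recall on the positive class: TP / (TP + FN)."""
    tp, _, fn = confusion_counts(pred, truth)
    denom = tp + fn
    return tp / denom if denom else 1.0
```

With `pred = [1, 1, 0, 0]` and `truth = [1, 0, 1, 0]` there is one each of TP, FP, and FN, so IoU is 1/3 and sensitivity is 0.5.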
Collapse
Affiliation(s)
- Yinghao Liu
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, China
- Shanghai University of Medicine and Health Sciences, Shanghai, 200237, China
- Department of Surgery, Shanghai Key Laboratory of Gastric Neoplasms, Shanghai Institute of Digestive Surgery, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
| | - Pengchen Liang
- School of Microelectronics, Shanghai University, Shanghai, 201800, China
| | - Kaiyi Liang
- Department of Radiology, Jiading District Central Hospital Affiliated Shanghai University of Medicine & Health Sciences, Key Laboratory of Shanghai Municipal Health Commission for Smart Image, Shanghai, 201800, China.
| | - Qing Chang
- Department of Surgery, Shanghai Key Laboratory of Gastric Neoplasms, Shanghai Institute of Digestive Surgery, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China.
| |
Collapse
|
34
|
M GJ, S B. DeepNet model empowered cuckoo search algorithm for the effective identification of lung cancer nodules. FRONTIERS IN MEDICAL TECHNOLOGY 2023; 5:1157919. [PMID: 37752910 PMCID: PMC10518616 DOI: 10.3389/fmedt.2023.1157919] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2023] [Accepted: 08/22/2023] [Indexed: 09/28/2023] Open
Abstract
Introduction Globally, lung cancer is a highly harmful type of cancer. An efficient diagnosis system can enable pathologists to recognize the type and nature of lung nodules and the mode of therapy to increase the patient's chance of survival. Hence, implementing an automatic and reliable system to segment lung nodules from a computed tomography (CT) image is useful in the medical industry. Methods This study develops a novel fully convolutional deep neural network (hereafter called DeepNet) model for segmenting lung nodules from CT scans. This model includes an encoder/decoder network that achieves pixel-wise image segmentation. The encoder network exploits a Visual Geometry Group (VGG-19) model as a base architecture, while the decoder network exploits 16 upsampling and deconvolution modules. The encoder used in this model has a very flexible structural design that can be modified and trained for any resolution based on the size of input scans. The decoder network upsamples and maps the low-resolution attributes of the encoder. Thus, there is a considerable drop in the number of variables used for the learning process, as the network recycles the pooling indices of the encoder for segmentation. A thresholding method and the cuckoo search algorithm determine the most useful features when categorizing cancer nodules. Results and discussion The effectiveness of the intended DeepNet model is carefully assessed on the real-world database known as The Cancer Imaging Archive (TCIA) dataset, and its effectiveness is demonstrated by comparing its performance with that of other modern segmentation models on selected performance measures. The empirical analysis reveals that DeepNet significantly outperforms other prevalent segmentation algorithms, with a volume error of 0.962 ± 0.023%, a Dice similarity coefficient of 0.968 ± 0.011, a Jaccard similarity index of 0.856 ± 0.011, and an average processing time of 0.045 ± 0.005 s.
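The abstract pairs a thresholding step with the cuckoo search algorithm for feature selection but does not specify the thresholding rule. The sketch below uses Otsu's criterion (maximizing between-class variance) as one common choice; this is an assumption for illustration, not necessarily the authors' method, and the stochastic cuckoo search itself is omitted:

```python
def otsu_threshold(values):
    """Otsu's method on a flat list of intensity values: pick the threshold t
    (splitting values into <= t and > t) that maximizes between-class variance."""
    n = len(values)
    best_t, best_var = None, -1.0
    # Splitting at the maximum value would leave one class empty, so skip it.
    for t in sorted(set(values))[:-1]:
        lower = [v for v in values if v <= t]
        upper = [v for v in values if v > t]
        w0, w1 = len(lower) / n, len(upper) / n   # class weights
        mu0 = sum(lower) / len(lower)             # class means
        mu1 = sum(upper) / len(upper)
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t
```

On the bimodal toy sample `[1, 1, 2, 8, 9, 9]` the criterion selects the threshold 2, separating the two clusters.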
Collapse
Affiliation(s)
- Grace John M
- Department of Electronics and Communication, Karpagam Academy of Higher Education, Coimbatore, India
| | | |
Collapse
|
35
|
Li W, Jia M, Yang C, Lin Z, Yu Y, Zhang W. SPA-UNet: A liver tumor segmentation network based on fused multi-scale features. Open Life Sci 2023; 18:20220685. [PMID: 37724113 PMCID: PMC10505346 DOI: 10.1515/biol-2022-0685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2023] [Revised: 06/26/2023] [Accepted: 07/24/2023] [Indexed: 09/20/2023] Open
Abstract
Liver tumor segmentation is a critical part in the diagnosis and treatment of liver cancer. While U-shaped convolutional neural networks (UNets) have made significant strides in medical image segmentation, challenges remain in accurately segmenting tumor boundaries and detecting small tumors, resulting in low segmentation accuracy. To improve the segmentation accuracy of liver tumors, this work proposes space pyramid attention (SPA)-UNet, a novel image segmentation network with an encoder-decoder architecture. SPA-UNet consists of four modules: (1) Spatial pyramid convolution block (SPCB), extracting multi-scale features by fusing three sets of dilated convolutions with different rates. (2) Spatial pyramid pooling block (SPPB), performing downsampling to reduce image size. (3) Upsample module, integrating dense positional and semantic information. (4) Residual attention block (RA-Block), enabling precise tumor localization. The encoder incorporates 5 SPCBs and 4 SPPBs to capture contextual information. The decoder consists of the Upsample module and the RA-Block, and finally a segmentation head outputs segmented images of the liver and liver tumor. Experiments using the liver tumor segmentation dataset demonstrate that SPA-UNet surpasses the traditional UNet model, achieving improvements of 1.0% and 2.0% in intersection over union for liver and tumors, respectively, along with recall rate increases of 1.2% and 1.8%. These advancements provide a dependable foundation for liver cancer diagnosis and treatment.
Collapse
Affiliation(s)
- Weikun Li
- School of Computer and Information Security, Guilin University of Electronic Technology, Guilin, Guangxi, 541000, China
| | - Maoning Jia
- School of Computer and Information Security, Guilin University of Electronic Technology, Guilin, Guangxi, 541000, China
| | - Chen Yang
- School of Business, Guilin University of Electronic Technology, Guilin, Guangxi, 541000, China
| | - Zhenyuan Lin
- School of Computer and Information Security, Guilin University of Electronic Technology, Guilin, Guangxi, 541000, China
| | - Yuekang Yu
- School of Information and Communication, Guilin University of Electronic Technology, Guilin, Guangxi, 541000, China
| | - Wenhui Zhang
- School of Computer and Information Security, Guilin University of Electronic Technology, Guilin, Guangxi, 541000, China
| |
Collapse
|
36
|
Chen Y, Zheng C, Zhang W, Lin H, Chen W, Zhang G, Xu G, Wu F. MS-FANet: Multi-scale feature attention network for liver tumor segmentation. Comput Biol Med 2023; 163:107208. [PMID: 37421737 DOI: 10.1016/j.compbiomed.2023.107208] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2023] [Revised: 06/07/2023] [Accepted: 06/25/2023] [Indexed: 07/10/2023]
Abstract
Accurate segmentation of liver tumors is a prerequisite for early diagnosis of liver cancer. Segmentation networks extract features continuously at the same scale, which cannot adapt to the variation of liver tumor volume in computed tomography (CT). Hence, a multi-scale feature attention network (MS-FANet) for liver tumor segmentation is proposed in this paper. The novel residual attention (RA) block and multi-scale atrous downsampling (MAD) are introduced in the encoder of MS-FANet to sufficiently learn variable tumor features and to extract tumor features at different scales simultaneously. The dual-path feature (DF) filter and dense upsampling (DU) are introduced in the feature reduction process to reduce effective features for the accurate segmentation of liver tumors. On the public LiTS and 3DIRCADb datasets, MS-FANet achieved average Dice scores of 74.2% and 78.0%, respectively, outperforming most state-of-the-art networks. These results strongly demonstrate MS-FANet's excellent liver tumor segmentation performance and its ability to learn features at different scales.
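The atrous (dilated) convolutions behind multi-scale modules such as MAD widen the receptive field by spacing kernel taps `rate` samples apart, without adding weights. A minimal 1-D Python sketch (the paper's MAD operates on 2-D/3-D feature maps; this only illustrates the dilation mechanism):

```python
def dilated_conv1d(signal, kernel, rate):
    """'Valid' 1-D cross-correlation with dilation rate `rate`:
    kernel taps are applied to samples spaced `rate` apart."""
    span = (len(kernel) - 1) * rate  # receptive field minus one
    return [sum(kernel[k] * signal[i + k * rate] for k in range(len(kernel)))
            for i in range(len(signal) - span)]

def multi_scale_features(signal, kernel, rates=(1, 2)):
    """Apply the same kernel at several dilation rates, as multi-scale
    modules do before fusing the resulting feature maps."""
    return {r: dilated_conv1d(signal, kernel, r) for r in rates}
```

With a box kernel `[1, 1, 1]` on `[1, 2, 3, 4, 5]`, rate 1 sums adjacent triples (`[6, 9, 12]`), while rate 2 spans the whole signal in one output (`[9]`), showing how a larger rate trades resolution for context.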
Collapse
Affiliation(s)
- Ying Chen
- School of Software, Nanchang Hangkong University, Nanchang, 330063, PR China
| | - Cheng Zheng
- School of Software, Nanchang Hangkong University, Nanchang, 330063, PR China.
| | - Wei Zhang
- School of Software, Nanchang Hangkong University, Nanchang, 330063, PR China
| | - Hongping Lin
- School of Software, Nanchang Hangkong University, Nanchang, 330063, PR China
| | - Wang Chen
- School of Software, Nanchang Hangkong University, Nanchang, 330063, PR China
| | - Guimei Zhang
- Institute of Computer Vision, Nanchang Hangkong University, Nanchang, 330063, PR China
| | - Guohui Xu
- Department of Hepatobiliary Surgery, Jiangxi Cancer Hospital, Nanchang, 330029, PR China.
| | - Fang Wu
- Department of Gastroenterology, First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325035, PR China.
| |
Collapse
|
37
|
Liu J, Xiao H, Fan J, Hu W, Yang Y, Dong P, Xing L, Cai J. An overview of artificial intelligence in medical physics and radiation oncology. JOURNAL OF THE NATIONAL CANCER CENTER 2023; 3:211-221. [PMID: 39035195 PMCID: PMC11256546 DOI: 10.1016/j.jncc.2023.08.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2022] [Revised: 05/03/2023] [Accepted: 08/08/2023] [Indexed: 07/23/2024] Open
Abstract
Artificial intelligence (AI) is developing rapidly and has found widespread applications in medicine, especially radiotherapy. This paper provides a brief overview of AI applications in radiotherapy and highlights research directions in which AI can potentially make significant impacts, together with relevant ongoing work in these directions. Challenging issues related to the clinical application of AI, such as the robustness and interpretability of AI models, are also discussed. Future research directions of AI in the field of medical physics and radiotherapy are highlighted.
Collapse
Affiliation(s)
- Jiali Liu
- Department of Clinical Oncology, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
- Department of Clinical Oncology, Hong Kong University Li Ka Shing Medical School, Hong Kong, China
| | - Haonan Xiao
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong, China
| | - Jiawei Fan
- Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
- Shanghai Key Laboratory of Radiation Oncology, Shanghai, China
| | - Weigang Hu
- Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
- Shanghai Key Laboratory of Radiation Oncology, Shanghai, China
| | - Yong Yang
- Department of Radiation Oncology, Stanford University, CA, USA
| | - Peng Dong
- Department of Radiation Oncology, Stanford University, CA, USA
| | - Lei Xing
- Department of Radiation Oncology, Stanford University, CA, USA
| | - Jing Cai
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong, China
| |
Collapse
|
38
|
Feng Y, Cong Y, Xing S, Wang H, Zhao C, Zhang X, Yao Q. Distance Matters: A Distance-Aware Medical Image Segmentation Algorithm. ENTROPY (BASEL, SWITZERLAND) 2023; 25:1169. [PMID: 37628199 PMCID: PMC10453236 DOI: 10.3390/e25081169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/05/2023] [Revised: 08/01/2023] [Accepted: 08/03/2023] [Indexed: 08/27/2023]
Abstract
The transformer-based U-Net network structure has gained popularity in the field of medical image segmentation. However, most networks overlook the impact of the distance between each patch on the encoding process. This paper proposes a novel GC-TransUnet for medical image segmentation. The key innovation is that it takes into account the relationships between patch blocks based on their distances, optimizing the encoding process in traditional transformer networks. This optimization results in improved encoding efficiency and reduced computational costs. Moreover, the proposed GC-TransUnet is combined with U-Net to accomplish the segmentation task. In the encoder part, the traditional vision transformer is replaced by the global context vision transformer (GC-VIT), eliminating the need for the CNN network while retaining skip connections for subsequent decoders. Experimental results demonstrate that the proposed algorithm achieves superior segmentation results compared to other algorithms when applied to medical images.
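The core idea of weighting patch interactions by their distance can be illustrated with a softmax attention row whose raw scores are penalized in proportion to patch distance. The linear `decay` bias below is a simplified stand-in chosen for illustration; GC-ViT's actual global-context mechanism is different and more involved:

```python
import math

def distance_aware_attention(scores, positions, decay=0.5):
    """Row-wise softmax attention where each raw score scores[q][k] is reduced
    by decay * |positions[q] - positions[k]|, so nearer patches weigh more.
    (Illustrative bias only, not the GC-ViT formulation.)"""
    weights = []
    for q in range(len(scores)):
        biased = [scores[q][k] - decay * abs(positions[q] - positions[k])
                  for k in range(len(scores))]
        m = max(biased)                       # subtract max for numerical stability
        exps = [math.exp(b - m) for b in biased]
        s = sum(exps)
        weights.append([e / s for e in exps])
    return weights
```

With uniform raw scores and patches at positions 0, 1, 2, the first patch attends most to itself and least to the farthest patch, while the middle patch weights its two neighbors equally.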
Collapse
Affiliation(s)
- Yuncong Feng
- College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China; (Y.F.); (Y.C.); (S.X.); (H.W.); (C.Z.); (Q.Y.)
- Artificial Intelligence Research Institute, Changchun University of Technology, Changchun 130012, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
| | - Yeming Cong
- College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China; (Y.F.); (Y.C.); (S.X.); (H.W.); (C.Z.); (Q.Y.)
| | - Shuaijie Xing
- College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China; (Y.F.); (Y.C.); (S.X.); (H.W.); (C.Z.); (Q.Y.)
| | - Hairui Wang
- College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China; (Y.F.); (Y.C.); (S.X.); (H.W.); (C.Z.); (Q.Y.)
| | - Cuixing Zhao
- College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China; (Y.F.); (Y.C.); (S.X.); (H.W.); (C.Z.); (Q.Y.)
| | - Xiaoli Zhang
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
| | - Qingan Yao
- College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China; (Y.F.); (Y.C.); (S.X.); (H.W.); (C.Z.); (Q.Y.)
| |
Collapse
|
39
|
Zhou H, Sun C, Huang H, Fan M, Yang X, Zhou L. Feature-guided attention network for medical image segmentation. Med Phys 2023; 50:4871-4886. [PMID: 36746870 DOI: 10.1002/mp.16253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2022] [Revised: 01/03/2023] [Accepted: 01/06/2023] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND U-Net and its variations have achieved remarkable performance in medical image segmentation. However, they have two limitations. First, the shallow-layer features of the encoder always contain background noise. Second, semantic gaps exist between the features of the encoder and the decoder: skip-connections directly connect the encoder to the decoder, which leads to the fusion of semantically dissimilar feature maps. PURPOSE To overcome these two limitations, this paper proposes a novel medical image segmentation algorithm, called the feature-guided attention network, which consists of U-Net, the cross-level attention filtering module (CAFM), and the attention-guided upsampling module (AUM). METHODS In the proposed method, the AUM and the CAFM were introduced into the U-Net, where the CAFM learns to filter the background noise in the low-level feature map of the encoder and the AUM tries to eliminate the semantic gap between the encoder and the decoder. Specifically, the CAFM adopts a top-down pathway that uses the high-level feature map to filter the background noise in the low-level feature map of the encoder. The AUM uses the encoder features to guide the upsampling of the corresponding decoder features, thus eliminating the semantic gap between them. Four medical image segmentation tasks, including coronary atherosclerotic plaque segmentation (Dataset A), retinal vessel segmentation (Dataset B), skin lesion segmentation (Dataset C), and multiclass retinal edema lesion segmentation (Dataset D), were used to validate the proposed method. RESULTS For Dataset A, the proposed method achieved higher Intersection over Union (IoU) (67.91 ± 3.82%), dice (79.39 ± 3.37%), accuracy (98.39 ± 0.34%), and sensitivity (85.10 ± 3.74%) than the previous best method, CA-Net. For Dataset B, the proposed method achieved higher sensitivity (83.50%) and accuracy (97.55%) than the previous best method, SCS-Net. For Dataset C, the proposed method had higher IoU (83.47 ± 0.41%) and dice (90.81 ± 0.34%) than all compared previous methods. For Dataset D, the proposed method had the highest dice (average: 81.53%; retinal edema area [REA]: 83.78%; pigment epithelial detachment [PED]: 77.13%), sensitivity (REA: 89.01%; SRF: 85.50%), specificity (REA: 99.35%; PED: 100.00%), and accuracy (98.73%) among all compared previous networks. In addition, the number of parameters of the proposed method was 2.43 M, which is less than that of CA-Net (3.21 M) and CPF-Net (3.07 M). CONCLUSIONS The proposed method demonstrated state-of-the-art performance, outperforming other top-notch medical image segmentation algorithms. The CAFM filtered the background noise in the low-level feature map of the encoder, while the AUM eliminated the semantic gap between the encoder and the decoder. Furthermore, the proposed method was of high computational efficiency.
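This entry reports both IoU and Dice for the same datasets. For a single binary mask pair the two are algebraically linked by DSC = 2·IoU / (1 + IoU); note the identity holds per case, so values averaged over many cases (as reported here) need not satisfy it exactly. A quick sketch of the conversion:

```python
def dice_from_iou(iou):
    """Per-case identity: DSC = 2 * IoU / (1 + IoU)."""
    return 2.0 * iou / (1.0 + iou)

def iou_from_dice(dice):
    """Inverse per-case identity: IoU = DSC / (2 - DSC)."""
    return dice / (2.0 - dice)
```

For instance, an IoU of 0.5 corresponds to a Dice score of 2/3, and the two conversions are exact inverses of each other.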
Collapse
Affiliation(s)
- Hao Zhou
- National Key Laboratory of Science and Technology of Underwater Vehicle, Harbin Engineering University, Harbin, China
| | - Chaoyu Sun
- Fourth Affiliated Hospital, Harbin Medical University, Harbin, China
| | - Hai Huang
- National Key Laboratory of Science and Technology of Underwater Vehicle, Harbin Engineering University, Harbin, China
| | - Mingyu Fan
- College of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou, China
| | - Xu Yang
- State Key Laboratory of Management and Control for Complex System, Institute of Automation, Chinese Academy of Sciences, Beijing, China
| | - Linxiao Zhou
- Fourth Affiliated Hospital, Harbin Medical University, Harbin, China
| |
Collapse
|
40
|
Li Y, Zou B, Dai P, Liao M, Bai HX, Jiao Z. AC-E Network: Attentive Context-Enhanced Network for Liver Segmentation. IEEE J Biomed Health Inform 2023; 27:4052-4061. [PMID: 37204947 DOI: 10.1109/jbhi.2023.3278079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/21/2023]
Abstract
Segmentation of the liver from CT scans is essential in computer-aided liver disease diagnosis and treatment. However, a 2D CNN ignores the 3D context, while a 3D CNN suffers from numerous learnable parameters and high computational cost. To overcome these limitations, we propose an Attentive Context-Enhanced Network (AC-E Network) consisting of 1) an attentive context encoding module (ACEM) that can be integrated into the 2D backbone to extract 3D context without a sharp increase in the number of learnable parameters; and 2) a dual segmentation branch with a complementary loss that makes the network attend to both the liver region and its boundary, so that the segmented liver surface is obtained with high accuracy. Extensive experiments on the LiTS and 3D-IRCADb datasets demonstrate that our method outperforms existing approaches and is competitive with the state-of-the-art 2D-3D hybrid method in balancing segmentation precision against the number of model parameters.
Collapse
|
41
|
Glänzer L, Masalkhi HE, Roeth AA, Schmitz-Rode T, Slabu I. Vessel Delineation Using U-Net: A Sparse Labeled Deep Learning Approach for Semantic Segmentation of Histological Images. Cancers (Basel) 2023; 15:3773. [PMID: 37568589 PMCID: PMC10417575 DOI: 10.3390/cancers15153773] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2023] [Revised: 07/20/2023] [Accepted: 07/21/2023] [Indexed: 08/13/2023] Open
Abstract
Semantic segmentation is an important image analysis method enabling the identification of tissue structures. Histological image segmentation is particularly challenging: the images contain rich structural information, while only limited training data are available. Additionally, labeling these structures to generate training data is time-consuming. Here, we demonstrate the feasibility of semantic segmentation using U-Net with a novel sparse labeling technique. The basic U-Net architecture was extended by attention gates, residual and recurrent links, and dropout regularization. To overcome the high class imbalance intrinsic to histological data, under- and oversampling and data augmentation were used. In an ablation study, various architectures were evaluated, and the best performing model was identified. This model contains attention gates, residual links, and a dropout regularization of 0.125. The segmented images show accurate delineations of the vascular structures (with a precision of 0.9088 and an AUC-ROC score of 0.9717), and the segmentation algorithm is robust to images containing staining variations and damaged tissue. These results demonstrate the feasibility of sparse labeling in combination with the modified U-Net architecture.
Collapse
Affiliation(s)
- Lukas Glänzer
- Institute of Applied Medical Engineering, Helmholtz Institute, Medical Faculty, RWTH Aachen University, Pauwelsstraße 20, 52074 Aachen, Germany; (L.G.); (H.E.M.); (T.S.-R.)
| | - Husam E. Masalkhi
- Institute of Applied Medical Engineering, Helmholtz Institute, Medical Faculty, RWTH Aachen University, Pauwelsstraße 20, 52074 Aachen, Germany; (L.G.); (H.E.M.); (T.S.-R.)
| | - Anjali A. Roeth
- Department of Visceral and Transplantation Surgery, University Hospital RWTH Aachen, Pauwelsstrasse 30, 52074 Aachen, Germany;
- Department of Surgery, Maastricht University, P. Debyelaan 25, 6229 Maastricht, The Netherlands
| | - Thomas Schmitz-Rode
- Institute of Applied Medical Engineering, Helmholtz Institute, Medical Faculty, RWTH Aachen University, Pauwelsstraße 20, 52074 Aachen, Germany; (L.G.); (H.E.M.); (T.S.-R.)
| | - Ioana Slabu
- Institute of Applied Medical Engineering, Helmholtz Institute, Medical Faculty, RWTH Aachen University, Pauwelsstraße 20, 52074 Aachen, Germany; (L.G.); (H.E.M.); (T.S.-R.)
| |
Collapse
|
42
|
Costanzo A, Ertl-Wagner B, Sussman D. AFNet Algorithm for Automatic Amniotic Fluid Segmentation from Fetal MRI. Bioengineering (Basel) 2023; 10:783. [PMID: 37508809 PMCID: PMC10376488 DOI: 10.3390/bioengineering10070783] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2023] [Revised: 06/25/2023] [Accepted: 06/27/2023] [Indexed: 07/30/2023] Open
Abstract
Amniotic Fluid Volume (AFV) is a crucial fetal biomarker when diagnosing specific fetal abnormalities. This study proposes a novel Convolutional Neural Network (CNN) model, AFNet, for segmenting amniotic fluid (AF) to facilitate clinical AFV evaluation. AFNet was trained and tested on a manually segmented and radiologist-validated AF dataset. AFNet outperforms ResUNet++ by using efficient feature mapping in the attention block and transposed convolutions in the decoder. Our experimental results show that AFNet achieved a mean Intersection over Union (mIoU) of 93.38% on our dataset, thereby outperforming other state-of-the-art models. While AFNet achieves performance scores similar to those of the UNet++ model, it does so with less than half the number of parameters. By creating a detailed AF dataset with an improved CNN architecture, we enable the quantification of AFV in clinical practice, which can aid in diagnosing AF disorders during gestation.
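The headline metric here, mean Intersection over Union, is computed per class and averaged. A straightforward reference implementation (not the paper's code) for integer label maps:

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union across classes.
    Classes absent from both prediction and ground truth are skipped
    so they neither inflate nor deflate the average."""
    ious = []
    for c in range(num_classes):
        p, t = (pred == c), (target == c)
        union = np.logical_or(p, t).sum()
        if union == 0:  # class absent everywhere: skip
            continue
        inter = np.logical_and(p, t).sum()
        ious.append(inter / union)
    return float(np.mean(ious))
```

Whether absent classes are skipped or counted as IoU 1.0 varies between toolkits, so reported mIoU values are only comparable when that convention matches.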
Collapse
Affiliation(s)
- Alejo Costanzo
- Department of Electrical, Computer and Biomedical Engineering, Faculty of Engineering and Architectural Sciences, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada
- Institute for Biomedical Engineering, Science and Technology (iBEST), Toronto Metropolitan University and St. Michael's Hospital, Toronto, ON M5B 1T8, Canada
| | - Birgit Ertl-Wagner
- Department of Diagnostic Imaging, The Hospital for Sick Children, Toronto, ON M5G 1X8, Canada
- Department of Medical Imaging, Faculty of Medicine, University of Toronto, Toronto, ON M5T 1W7, Canada
| | - Dafna Sussman
- Department of Electrical, Computer and Biomedical Engineering, Faculty of Engineering and Architectural Sciences, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada
- Institute for Biomedical Engineering, Science and Technology (iBEST), Toronto Metropolitan University and St. Michael's Hospital, Toronto, ON M5B 1T8, Canada
- Department of Obstetrics and Gynecology, Faculty of Medicine, University of Toronto, Toronto, ON M5G 1E2, Canada
| |
Collapse
|
43
|
Dumitru RG, Peteleaza D, Craciun C. Using DUCK-Net for polyp image segmentation. Sci Rep 2023; 13:9803. [PMID: 37328572 PMCID: PMC10276013 DOI: 10.1038/s41598-023-36940-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2023] [Accepted: 06/13/2023] [Indexed: 06/18/2023] Open
Abstract
This paper presents a novel supervised convolutional neural network architecture, "DUCK-Net", capable of effectively learning and generalizing from small amounts of medical images to perform accurate segmentation tasks. Our model utilizes an encoder-decoder structure with a residual downsampling mechanism and a custom convolutional block to capture and process image information at multiple resolutions in the encoder segment. We employ data augmentation techniques to enrich the training set, thus increasing our model's performance. While our architecture is versatile and applicable to various segmentation tasks, in this study, we demonstrate its capabilities specifically for polyp segmentation in colonoscopy images. We evaluate the performance of our method on several popular benchmark datasets for polyp segmentation (Kvasir-SEG, CVC-ClinicDB, CVC-ColonDB, and ETIS-LARIBPOLYPDB), showing that it achieves state-of-the-art results in terms of mean Dice coefficient, Jaccard index, Precision, Recall, and Accuracy. Our approach demonstrates strong generalization capabilities, achieving excellent performance even with limited training data.
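The Dice coefficient and Jaccard index reported above are closely related overlap measures on binary masks. A generic implementation (not taken from the paper) for comparison:

```python
import numpy as np

def dice_and_jaccard(pred, target, eps=1e-7):
    """Dice coefficient and Jaccard (IoU) index for binary masks.
    Dice = 2|A∩B| / (|A|+|B|); Jaccard = |A∩B| / |A∪B|.
    eps guards against division by zero on empty masks."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    inter = np.logical_and(pred, target).sum()
    dice = 2.0 * inter / (pred.sum() + target.sum() + eps)
    jaccard = inter / (np.logical_or(pred, target).sum() + eps)
    return float(dice), float(jaccard)
```

The two are monotonically related (Dice = 2J/(1+J)), which is why papers usually report both moving in lockstep.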
Collapse
|
44
|
Lal S. TC-SegNet: robust deep learning network for fully automatic two-chamber segmentation of two-dimensional echocardiography. MULTIMEDIA TOOLS AND APPLICATIONS 2023:1-19. [PMID: 37362663 PMCID: PMC10238771 DOI: 10.1007/s11042-023-15524-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 03/09/2022] [Revised: 10/03/2022] [Accepted: 04/19/2023] [Indexed: 06/28/2023]
Abstract
Heart chamber quantification is an essential clinical task to analyze heart abnormalities by evaluating the heart volume estimated through the endocardial border of the chambers. A precise heart chamber segmentation algorithm using echocardiography is essential for improving the diagnosis of cardiac disease. This paper proposes a robust two-chamber segmentation network (TC-SegNet) for echocardiography that follows a U-Net architecture and effectively incorporates the proposed modified skip connections, Atrous Spatial Pyramid Pooling (ASPP) modules, and squeeze-and-excitation modules. The TC-SegNet is evaluated on the open-source, fully annotated dataset of cardiac acquisitions for multi-structure ultrasound segmentation (CAMUS). The proposed TC-SegNet obtained an average F1-score of 0.91, an average Dice score of 0.9284, and an IoU score of 0.8322, which are higher than those of the reference models used here for comparison. Further, it achieved a pixel error (PE) of 1.5109, significantly lower than that of the comparison models. The segmentation results and metrics show that the proposed model outperforms the state-of-the-art segmentation methods.
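The squeeze-and-excitation modules mentioned here recalibrate channels by a learned gating vector. A minimal NumPy sketch of the mechanism (weights `w1`, `w2` stand in for learned parameters; this is an illustration, not TC-SegNet's code):

```python
import numpy as np

def squeeze_excite(features, w1, w2):
    """Squeeze-and-excitation: global-average-pool each channel to a
    descriptor, pass it through a two-layer bottleneck (ReLU then
    sigmoid), and rescale the channels by the resulting gates.
    features: (C, H, W); w1: (C//r, C); w2: (C, C//r)."""
    squeezed = features.mean(axis=(1, 2))            # (C,) channel descriptors
    hidden = np.maximum(w1 @ squeezed, 0.0)          # ReLU bottleneck
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))     # sigmoid gates in (0, 1)
    return features * gates[:, None, None]           # channel-wise rescaling
```

The reduction ratio r trades capacity for parameter count; in the original SE-Net formulation r = 16 is typical.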
Collapse
Affiliation(s)
- Shyam Lal
- Department of Electronics and Communication Engineering, National Institute of Technology Karnataka, Surathkal, Mangaluru, 575025 Karnataka India
| |
Collapse
|
45
|
Chen G, Li Z, Wang J, Wang J, Du S, Zhou J, Shi J, Zhou Y. An improved 3D KiU-Net for segmentation of liver tumor. Comput Biol Med 2023; 160:107006. [PMID: 37159962 DOI: 10.1016/j.compbiomed.2023.107006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2022] [Revised: 04/08/2023] [Accepted: 05/03/2023] [Indexed: 05/11/2023]
Abstract
It is a challenging task to accurately segment liver tumors from Computed Tomography (CT) images. The widely used U-Net and its variants generally struggle to accurately segment the detailed edges of small tumors, because the progressive downsampling operations in the encoder module gradually increase the receptive fields. These enlarged receptive fields have limited ability to learn information about tiny structures. KiU-Net is a newly proposed dual-branch model that can effectively perform image segmentation for small targets. However, the 3D version of KiU-Net has high computational complexity, which limits its application. In this work, an improved 3D KiU-Net (named TKiU-NeXt) is proposed for liver tumor segmentation from CT images. In TKiU-NeXt, a Transformer-based Kite-Net (TK-Net) branch is proposed to build the over-complete architecture to learn more detailed features for small structures, and an extended 3D version of UNeXt is developed to replace the original U-Net branch, which effectively reduces computational complexity while retaining superior segmentation performance. Moreover, a Mutual Guided Fusion Block (MGFB) is designed to effectively learn more features from the two branches and then fuse the complementary features for image segmentation. The experimental results on two public CT datasets and a private dataset demonstrate that the proposed TKiU-NeXt outperforms all the compared algorithms while having lower computational complexity, suggesting the effectiveness and efficiency of TKiU-NeXt.
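The growth of receptive fields under stacked convolution and downsampling, which this abstract identifies as the reason small tumors get lost, follows a simple recurrence (r grows by (k-1) times the cumulative stride). A small calculator illustrating the effect:

```python
def receptive_field(layers):
    """Receptive field of a stack of conv/pool layers, input to output.
    layers: list of (kernel_size, stride) tuples.
    Recurrence: r += (k - 1) * jump; jump *= stride."""
    rf, jump = 1, 1
    for k, s in layers:
        rf += (k - 1) * jump   # each layer widens the field by (k-1) input strides
        jump *= s              # cumulative stride between adjacent output pixels
    return rf
```

For example, two 3x3 conv + 2x2 pool stages already give a 10-pixel receptive field, so after four encoder stages a single deep-feature pixel covers far more than a tiny tumor, diluting its signal.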
Collapse
Affiliation(s)
- Guodong Chen
- Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Joint International Research Laboratory of Specialty Fiber Optics and Advanced Communication, School of Communication and Information Engineering, Shanghai University, China; Shanghai Institute for Advanced Communication and Data Science, Shanghai University, China
| | - Zheng Li
- Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Joint International Research Laboratory of Specialty Fiber Optics and Advanced Communication, School of Communication and Information Engineering, Shanghai University, China; Shanghai Institute for Advanced Communication and Data Science, Shanghai University, China
| | - Jian Wang
- Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Joint International Research Laboratory of Specialty Fiber Optics and Advanced Communication, School of Communication and Information Engineering, Shanghai University, China; Shanghai Institute for Advanced Communication and Data Science, Shanghai University, China
| | - Jun Wang
- Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Joint International Research Laboratory of Specialty Fiber Optics and Advanced Communication, School of Communication and Information Engineering, Shanghai University, China; Shanghai Institute for Advanced Communication and Data Science, Shanghai University, China
| | - Shisuo Du
- Department of Radiation Oncology, Zhongshan Hospital Fudan University Shanghai, China
| | - Jinghao Zhou
- University of Maryland School of Medicine, Baltimore, MD, USA
| | - Jun Shi
- Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Joint International Research Laboratory of Specialty Fiber Optics and Advanced Communication, School of Communication and Information Engineering, Shanghai University, China; Shanghai Institute for Advanced Communication and Data Science, Shanghai University, China.
| | - Yongkang Zhou
- Department of Radiation Oncology, Zhongshan Hospital Fudan University Shanghai, China.
| |
Collapse
|
46
|
Sang S, Zhou Y, Islam MT, Xing L. Small-Object Sensitive Segmentation Using Across Feature Map Attention. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:6289-6306. [PMID: 36178991 PMCID: PMC10823909 DOI: 10.1109/tpami.2022.3211171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/03/2023]
Abstract
Semantic segmentation is an important step in understanding the scene for many practical applications such as autonomous driving. Although Deep Convolutional Neural Network-based methods have significantly improved segmentation accuracy, small/thin objects remain challenging to segment because convolutional and pooling operations result in information loss, especially for small objects. This article presents a novel attention-based method called Across Feature Map Attention (AFMA) to address this challenge. It quantifies the inner relationship between small and large objects belonging to the same category by utilizing the different feature levels of the original image. AFMA compensates for the loss of high-level feature information of small objects and improves small/thin object segmentation. Our method can be used as an efficient plug-in for a wide range of existing architectures and produces a much more interpretable feature representation than former studies. Extensive experiments on eight widely used segmentation methods and other existing small-object segmentation models on CamVid and Cityscapes demonstrate that our method substantially and consistently improves the segmentation of small/thin objects.
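AFMA's exact formulation is not given in this abstract, but the underlying idea of letting one feature level attend over another can be sketched with generic scaled dot-product attention between flattened feature maps (this is a simplified illustration, not the published AFMA module):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_map_attention(high, low):
    """Toy cross-level attention: each spatial position of the
    high-level map attends over all positions of the low-level map
    and receives a convex combination of low-level features.
    high: (C, Nh), low: (C, Nl) with flattened spatial dimensions."""
    scores = high.T @ low / np.sqrt(high.shape[0])   # (Nh, Nl) similarities
    attn = softmax(scores, axis=1)                   # each row sums to 1
    return low @ attn.T                              # (C, Nh) aggregated features
```

Because low-level maps retain fine spatial detail, such aggregation can reinject small-object evidence that pooling has washed out of the high-level map.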
Collapse
|
47
|
Zhang H, Zhong X, Li G, Liu W, Liu J, Ji D, Li X, Wu J. BCU-Net: Bridging ConvNeXt and U-Net for medical image segmentation. Comput Biol Med 2023; 159:106960. [PMID: 37099973 DOI: 10.1016/j.compbiomed.2023.106960] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2022] [Revised: 04/12/2023] [Accepted: 04/17/2023] [Indexed: 04/28/2023]
Abstract
Medical image segmentation enables doctors to observe lesion regions better and make accurate diagnostic decisions. Single-branch models such as U-Net have achieved great progress in this field. However, the complementary local and global pathological semantics of heterogeneous neural networks have not yet been fully explored. The class-imbalance problem remains a serious issue. To alleviate these two problems, we propose a novel model called BCU-Net, which leverages the advantages of ConvNeXt in global interaction and U-Net in local processing. We propose a new multilabel recall loss (MRL) module to relieve the class imbalance problem and facilitate deep-level fusion of local and global pathological semantics between the two heterogeneous branches. Extensive experiments were conducted on six medical image datasets including retinal vessel and polyp images. The qualitative and quantitative results demonstrate the superiority and generalizability of BCU-Net. In particular, BCU-Net can handle diverse medical images with diverse resolutions. It has a flexible structure owing to its plug-and-play characteristics, which promotes its practicality.
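The abstract names a multilabel recall loss (MRL) for the class-imbalance problem without giving its form. One plausible recall-oriented sketch (an assumption for illustration, not BCU-Net's published loss) penalizes low per-class soft recall, which pushes the network not to miss minority-class pixels:

```python
import numpy as np

def soft_recall_loss(probs, target_onehot, eps=1e-7):
    """One minus the mean per-class soft recall.
    probs, target_onehot: arrays of shape (num_classes, num_pixels);
    probs holds predicted probabilities, target_onehot is 0/1."""
    tp = (probs * target_onehot).sum(axis=1)   # soft true positives per class
    pos = target_onehot.sum(axis=1)            # ground-truth positives per class
    present = pos > 0                          # ignore classes absent from the image
    recall = tp[present] / (pos[present] + eps)
    return float(1.0 - recall.mean())
```

Averaging recall over classes rather than pixels gives rare classes the same weight as dominant ones, which is the usual rationale for recall-style losses under imbalance.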
Collapse
Affiliation(s)
- Hongbin Zhang
- School of Software, East China Jiaotong University, China.
| | - Xiang Zhong
- School of Software, East China Jiaotong University, China.
| | - Guangli Li
- School of Information Engineering, East China Jiaotong University, China.
| | - Wei Liu
- School of Software, East China Jiaotong University, China.
| | - Jiawei Liu
- School of Software, East China Jiaotong University, China.
| | - Donghong Ji
- School of Cyber Science and Engineering, Wuhan University, China.
| | - Xiong Li
- School of Software, East China Jiaotong University, China.
| | - Jianguo Wu
- The Second Affiliated Hospital of Nanchang University, China.
| |
Collapse
|
48
|
Zheng T, Qin H, Cui Y, Wang R, Zhao W, Zhang S, Geng S, Zhao L. Segmentation of thyroid glands and nodules in ultrasound images using the improved U-Net architecture. BMC Med Imaging 2023; 23:56. [PMID: 37060061 PMCID: PMC10105426 DOI: 10.1186/s12880-023-01011-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2022] [Accepted: 04/05/2023] [Indexed: 04/16/2023] Open
Abstract
BACKGROUND Identifying thyroid nodules' boundaries is crucial for making an accurate clinical assessment. However, manual segmentation is time-consuming. This paper utilized U-Net and its improved variants to automatically segment thyroid nodules and glands. METHODS The 5822 ultrasound images used in the experiment came from two centers; 4658 images were used as the training dataset, and 1164 images were used as the independent mixed test dataset. Based on U-Net, a deformable-pyramid split-attention residual U-Net (DSRU-Net) was proposed by introducing ResNeSt blocks, atrous spatial pyramid pooling, and deformable convolution v3. This method better combines context information and extracts features of interest, and has advantages in segmenting nodules and glands of different shapes and sizes. RESULTS DSRU-Net obtained 85.8% mean Intersection over Union, 92.5% mean Dice coefficient and 94.1% nodule Dice coefficient, increases of 1.8%, 1.3% and 1.9% over U-Net. CONCLUSIONS Our method is more capable of identifying and segmenting glands and nodules than the original method, as shown by the results of correlational studies.
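Atrous spatial pyramid pooling, used in DSRU-Net above, rests on dilated (atrous) convolution: kernel taps are spaced apart so the receptive field grows without extra parameters. A 1D sketch of the operation (illustrative, not the paper's implementation):

```python
import numpy as np

def dilated_conv1d(signal, kernel, dilation):
    """Valid-mode 1D convolution with a dilated kernel: successive
    kernel taps are applied `dilation` samples apart, so a length-k
    kernel spans (k-1)*dilation + 1 input samples."""
    k = len(kernel)
    span = (k - 1) * dilation + 1        # effective kernel extent
    out = []
    for start in range(len(signal) - span + 1):
        taps = signal[start:start + span:dilation]
        out.append(float(np.dot(taps, kernel)))
    return np.array(out)
```

ASPP runs several such convolutions in parallel with different dilation rates and concatenates the results, capturing context at multiple scales at once.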
Collapse
Affiliation(s)
- Tianlei Zheng
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
- Artificial Intelligence Unit, Department of Medical Equipment Management, Affiliated Hospital of Xuzhou Medical University, Xuzhou, 221004, China
| | - Hang Qin
- Department of Medical Equipment Management, Nanjing First Hospital, Nanjing, 221000, China
| | - Yingying Cui
- Department of Pathology, Affiliated Hospital of Xuzhou Medical University, Xuzhou, 221004, China
| | - Rong Wang
- Department of Ultrasound Medicine, Affiliated Hospital of Xuzhou Medical University, Xuzhou, 221004, China
| | - Weiguo Zhao
- Artificial Intelligence Unit, Department of Medical Equipment Management, Affiliated Hospital of Xuzhou Medical University, Xuzhou, 221004, China
| | - Shijin Zhang
- Artificial Intelligence Unit, Department of Medical Equipment Management, Affiliated Hospital of Xuzhou Medical University, Xuzhou, 221004, China
| | - Shi Geng
- Artificial Intelligence Unit, Department of Medical Equipment Management, Affiliated Hospital of Xuzhou Medical University, Xuzhou, 221004, China
| | - Lei Zhao
- Artificial Intelligence Unit, Department of Medical Equipment Management, Affiliated Hospital of Xuzhou Medical University, Xuzhou, 221004, China.
| |
Collapse
|
49
|
Zhou Z, Bian Y, Pan S, Meng Q, Zhu W, Shi F, Chen X, Shao C, Xiang D. A dual branch and fine-grained enhancement network for pancreatic tumor segmentation in contrast enhanced CT images. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104516] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022]
|
50
|
Diao Z, Jiang H, Zhou Y. Leverage prior texture information in deep learning-based liver tumor segmentation: A plug-and-play Texture-Based Auto Pseudo Label module. Comput Med Imaging Graph 2023; 106:102217. [PMID: 36958076 DOI: 10.1016/j.compmedimag.2023.102217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/28/2022] [Revised: 01/16/2023] [Accepted: 03/04/2023] [Indexed: 03/17/2023]
Abstract
Segmenting the liver and tumor regions in CT scans is crucial for subsequent treatment in clinical practice and radiotherapy. Recently, liver and tumor segmentation techniques based on U-Net have gained popularity. However, there are numerous varieties of liver tumors, and they differ greatly in their shapes and textures. It is unreasonable to regard all liver tumors as one class for learning. Meanwhile, texture information is crucial for the identification of liver tumors. We propose a plug-and-play Texture-based Auto Pseudo Label (TAPL) module to make use of the texture information of tumors and enable the neural network to actively learn the texture differences between various tumors, increasing segmentation accuracy, especially for small tumors. The TAPL module consists of two parts: texture enhancement and a texture-based pseudo label generator. To highlight the regions where the texture varies significantly, we enhance the textured areas of the CT image. Based on their texture information, tumors are automatically divided into several classes by the texture-based pseudo label generator. The multi-class tumor labels produced by the neural network during the prediction step are merged into a single tumor label, which is then used as the outcome of the segmentation. Experiments on a clinical dataset and the public dataset LiTS2017 show that the proposed algorithm outperforms single liver tumor label segmentation methods and performs better on small tumors.
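The abstract does not specify how its pseudo label generator partitions tumors by texture; one simple stand-in (an assumption for illustration, not the TAPL algorithm) is to score each tumor patch by a texture statistic such as intensity standard deviation and split the scores into equal-frequency bins:

```python
import numpy as np

def texture_pseudo_labels(patches, num_bins=3):
    """Assign each tumor patch a pseudo class from its texture,
    here measured by intensity standard deviation and split into
    equal-frequency bins via quantile edges."""
    stds = np.array([np.asarray(p).std() for p in patches])
    edges = np.quantile(stds, np.linspace(0, 1, num_bins + 1)[1:-1])
    return np.digitize(stds, edges)      # labels in {0, ..., num_bins-1}
```

At inference the per-bin predictions would be merged back into one tumor mask, matching the abstract's description of collapsing the multi-class output into a single tumor label.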
Collapse
Affiliation(s)
- Zhaoshuo Diao
- Software College, Northeastern University, Shenyang 110819, China
| | - Huiyan Jiang
- Software College, Northeastern University, Shenyang 110819, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang 110819, China.
| | - Yang Zhou
- Software College, Northeastern University, Shenyang 110819, China
| |
Collapse
|