1. Zheng RJ, Li DL, Lin HM, Wang JF, Luo YM, Tang Y, Li F, Hu Y, Su S. Bibliometrics of artificial intelligence applications in hepatobiliary surgery from 2014 to 2024. World J Gastrointest Surg 2025; 17:104728. [DOI: 10.4240/wjgs.v17.i5.104728]
Abstract
BACKGROUND In recent years, the rapid development of artificial intelligence (AI) in hepatobiliary surgery research has led to a growing number of articles exploring its benefits. We performed a bibliometric analysis of AI applications in hepatobiliary surgery to delineate the contemporary state of the field and its potential future trajectories.
AIM To provide clinical practitioners with a reliable reference point by systematically examining the contributions of authors, countries, institutions, journals, and keywords in this domain over the last 10 years.
METHODS The academic resources utilized in this study were obtained from the Web of Science Core Collection database. The search results were subsequently integrated and imported into CiteSpace and VOSviewer software for the purpose of visual analysis.
RESULTS The study analyzed 2552 publications during 2014–2024. These publications collectively garnered 32628 citations, averaging 15.66 citations per paper. The top contributor to this field was China. The USA had the highest citation count. The author with the highest citation count was Summers RM. In terms of the number of articles published, the leading journal was Medical Physics. Excluding the subject search terms, the most frequently used keywords included “classification”, “CT”, and “diagnosis”.
CONCLUSION This bibliometric analysis indicates that research on AI in hepatobiliary surgery has entered a period of rapid development, particularly in the domain of disease imaging diagnostics.
Affiliation(s)
- Ru-Jun Zheng
- Department of General Surgery (Hepatopancreatobiliary Surgery), The Affiliated Hospital of Southwest Medical University, Luzhou 646000, Sichuan Province, China
- Dong-Lun Li
- Department of Nephrology, University Hospital Essen, University of Duisburg-Essen, Essen 45147, Germany
- Hao-Min Lin
- Department of Hepatobiliary Pancreatic Surgery, Chengdu Sixth People’s Hospital, Chengdu 610000, Sichuan Province, China
- Jun-Feng Wang
- College of Computer Science, Sichuan University, Chengdu 610065, Sichuan Province, China
- Ya-Mei Luo
- School of Medical Information and Engineering, Southwest Medical University, Luzhou 646000, Sichuan Province, China
- Yong Tang
- School of Computer Science and Engineering, University of Electronic Science and Technology, Chengdu 611731, Sichuan Province, China
- Fan Li
- College of Artificial Intelligence (CUIT Shuangliu Industrial College), Chengdu University of Information Technology, Chengdu 610225, Sichuan Province, China
- The Institute of Digital Health and Medical ITA Innovation Industry, Chengdu 610225, Sichuan Province, China
- Yue Hu
- Department of Rehabilitation, The Affiliated Hospital of Southwest Medical University, Luzhou 646000, Sichuan Province, China
- Rehabilitation Medicine and Engineering Key Laboratory of Luzhou, Luzhou 646000, Sichuan Province, China
- Song Su
- Department of General Surgery (Hepatopancreatobiliary Surgery), The Affiliated Hospital of Southwest Medical University, Luzhou 646000, Sichuan Province, China
- Academician (Expert) Workstation of Sichuan Province, Metabolic Hepatobiliary and Pancreatic Diseases Key Laboratory of Luzhou City, The Affiliated Hospital of Southwest Medical University, Luzhou 646000, Sichuan Province, China
2. Zhu Y, Wang X, Liu T, Fu Y. Multi-perspective dynamic consistency learning for semi-supervised medical image segmentation. Sci Rep 2025; 15:18266. [PMID: 40415094] [DOI: 10.1038/s41598-025-03124-2]
Abstract
Semi-supervised learning (SSL) is an effective method for medical image segmentation as it alleviates the dependence on clinical pixel-level annotations. Among SSL methods, pseudo-labeling and consistency regularization form the dominant paradigm. However, current consistency regularization methods based on shared encoder structures are prone to trapping the model in cognitive bias, which impairs segmentation performance. Furthermore, traditional fixed-threshold pseudo-label selection methods fail to exploit low-confidence pixels, leaving the model's initial segmentation capability insufficient, especially for confusing regions. To this end, we propose a multi-perspective dynamic consistency (MPDC) framework to mitigate model cognitive bias and to fully utilize low-confidence pixels. Specifically, we propose a novel multi-perspective collaborative learning strategy that encourages the sub-branch networks to learn discriminative features from multiple perspectives, thus avoiding the problem of model cognitive bias and enhancing boundary perception. In addition, we employ a dynamic decoupling consistency scheme to fully utilize low-confidence pixels: by dynamically adjusting the threshold, more pseudo-labels are involved in the early stages of training. Extensive experiments on several challenging medical image segmentation datasets show that our method achieves state-of-the-art performance, especially on boundaries, with significant improvements.
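The dynamic-threshold idea described in this abstract, where a lower confidence threshold early in training lets more low-confidence pixels contribute pseudo-labels, can be sketched as follows. This is a minimal illustration with a linear schedule; the function name, the schedule shape, and the threshold values are assumptions, not taken from the paper.

```python
import numpy as np

def select_pseudo_labels(probs, step, total_steps, t_start=0.6, t_end=0.95):
    """Select pseudo-labels with a dynamically rising confidence threshold.

    probs: (H, W) array of predicted foreground probabilities.
    Returns (labels, mask): hard pseudo-labels and a boolean mask of
    the pixels confident enough to be used at this training step.
    """
    # Linearly interpolate the threshold over training progress:
    # permissive early (more pseudo-labels), strict late.
    t = t_start + (t_end - t_start) * (step / total_steps)
    confidence = np.maximum(probs, 1.0 - probs)   # confidence of the argmax class
    mask = confidence >= t                        # pixels passing the threshold
    labels = (probs >= 0.5).astype(np.uint8)      # hard pseudo-labels
    return labels, mask
```

Early steps admit borderline pixels that a fixed high threshold would discard, which is the behavior the abstract attributes to its dynamic scheme.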
Affiliation(s)
- Yongfa Zhu
- College of Computer Science and Technology, Beihua University, Jilin, 132013, China
- Xue Wang
- College of Computer Science and Technology, Beihua University, Jilin, 132013, China
- Taihui Liu
- College of Computer Science and Technology, Beihua University, Jilin, 132013, China
- Yongkang Fu
- College of Computer Science and Technology, Beihua University, Jilin, 132013, China
3. Yang B, Zhang J, Lyu Y, Zhang J. Automatic computed tomography image segmentation method for liver tumor based on a modified tokenized multilayer perceptron and attention mechanism. Quant Imaging Med Surg 2025; 15:2385-2404. [PMID: 40160629] [PMCID: PMC11948385] [DOI: 10.21037/qims-24-2132]
Abstract
Background The automatic medical image segmentation of liver and tumor plays a pivotal role in the clinical diagnosis of liver diseases. A number of effective methods based on deep neural networks, including convolutional neural networks (CNNs) and vision transformers (ViTs), have been developed. However, these networks primarily focus on enhancing segmentation accuracy while often overlooking segmentation speed, which is vital for rapid diagnosis in clinical settings. Therefore, we aimed to develop an automatic computed tomography (CT) image segmentation method for liver tumors that reduces inference time while maintaining accuracy, as rigorously validated through experimental studies. Methods We developed a U-shaped network enhanced by a multiscale attention module and attention gates, aimed at efficient CT image segmentation of liver tumors. In this network, a modified tokenized multilayer perceptron (MLP) block is first leveraged to reduce the feature dimensions and facilitate information interaction between adjacent patches so that the network can learn the key features of tumors with less computational complexity. Second, attention gates are added into the skip connections between the encoder and decoder, emphasizing feature expression in relevant regions and enabling the network to focus more on liver tumor features. Finally, a multiscale attention mechanism autonomously adjusts weights for each scale, allowing the network to adapt effectively to varying sizes of liver tumors. Our methodology was validated via the Liver Tumor Segmentation 2017 (LiTS17) public dataset. The data from this database are from seven global clinical sites. All data are anonymized, and the images have been prescreened to ensure the absence of personal identifiers. Standard metrics were used to evaluate the performance of the model. Results Twenty-one cases were included for testing. The proposed network attained a Dice score of 0.713 [95% confidence interval (CI): 0.592-0.834], a volumetric overlap error of 0.39 (95% CI: 0.17-0.61), a relative volume difference score of 0.19 (95% CI: -0.37 to 0.31), an average symmetric surface distance of 2.04 mm (95% CI: 0.89-4.19), a maximum surface distance of 9.42 mm (95% CI: 6.97-19.87), and an inference time of 26 ms on average for liver tumor segmentation. Conclusions The proposed network demonstrated efficient liver tumor segmentation performance with less inference time. Our findings contribute to the application of neural networks in rapid clinical diagnosis and treatment.
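The volume-overlap metrics reported above (Dice, volumetric overlap error, relative volume difference) follow standard definitions and can be computed for binary masks as in this sketch; it is a generic reimplementation of the textbook formulas, not the authors' evaluation code.

```python
import numpy as np

def volume_metrics(pred, gt):
    """Dice, VOE, and RVD for two binary segmentation masks.

    pred, gt: arrays of the same shape; nonzero entries are foreground.
    """
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    dice = 2.0 * inter / (pred.sum() + gt.sum())   # Dice similarity coefficient
    voe = 1.0 - inter / union                      # volumetric overlap error
    rvd = (pred.sum() - gt.sum()) / gt.sum()       # relative volume difference
    return dice, voe, rvd
```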
Affiliation(s)
- Bo Yang
- College of Mechanical Engineering, Donghua University, Shanghai, China
- Jie Zhang
- Institute of Artificial Intelligence, Donghua University, Shanghai, China
- Youlong Lyu
- Institute of Artificial Intelligence, Donghua University, Shanghai, China
- Jun Zhang
- College of Information Science and Technology, Donghua University, Shanghai, China
4. Dautkulova A, Aider OA, Teulière C, Coste J, Chaix R, Ouachik O, Pereira B, Lemaire JJ. Automated segmentation of deep brain structures from Inversion-Recovery MRI. Comput Med Imaging Graph 2025; 120:102488. [PMID: 39787737] [DOI: 10.1016/j.compmedimag.2024.102488]
Abstract
Methods for the automated segmentation of brain structures are a major subject of medical research. The small structures of the deep brain have received scant attention, notably for lack of manual delineations by medical experts. In this study, we assessed an automated segmentation of a novel clinical dataset containing White Matter Attenuated Inversion-Recovery (WAIR) MRI images and five manually segmented structures (substantia nigra (SN), subthalamic nucleus (STN), red nucleus (RN), mammillary body (MB) and mammillothalamic fascicle (MT-fa)) in 53 patients with severe Parkinson's disease. T1 and DTI images were additionally used. We also assessed the reorientation of DTI diffusion vectors with reference to the ACPC line. A state-of-the-art nnU-Net method was trained and tested on subsets of 38 and 15 image datasets respectively. We used Dice similarity coefficient (DSC), 95% Hausdorff distance (95HD), and volumetric similarity (VS) as metrics to evaluate network efficiency in reproducing manual contouring. Random-effects models statistically compared values according to structures, accounting for between- and within-participant variability. Results show that WAIR significantly outperformed T1 for DSC (0.739 ± 0.073), 95HD (1.739 ± 0.398), and VS (0.892 ± 0.044). The DSC values for automated segmentation of MB, RN, SN, STN, and MT-fa decreased in that order, in line with the increasing complexity observed in manual segmentation. Based on training results, the reorientation of DTI vectors improved the automated segmentation.
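The 95% Hausdorff distance (95HD) used as an evaluation metric here can be sketched from its standard definition. This is a brute-force version over 2D point sets for illustration; the paper applies it to surface voxels of 3D structures, and the percentile convention is the common one, assumed rather than quoted from the paper.

```python
import numpy as np

def hd95(points_a, points_b):
    """95th-percentile Hausdorff distance between two point sets.

    points_a, points_b: (N, D) and (M, D) coordinate arrays.
    """
    # Pairwise Euclidean distances between the two sets.
    d = np.linalg.norm(points_a[:, None, :] - points_b[None, :, :], axis=-1)
    # Directed distances: each point to its nearest neighbour in the other set.
    a_to_b = d.min(axis=1)
    b_to_a = d.min(axis=0)
    # 95th percentile over both directions, robust to a few outlier points.
    return np.percentile(np.concatenate([a_to_b, b_to_a]), 95)
```

Taking the 95th percentile instead of the maximum is what makes 95HD less sensitive to isolated stray voxels than the plain Hausdorff distance.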
Affiliation(s)
- Aigerim Dautkulova
- Université Clermont Auvergne, Clermont Auvergne INP, CNRS, Institut Pascal, F-63000 Clermont-Ferrand, France
- Omar Ait Aider
- Université Clermont Auvergne, Clermont Auvergne INP, CNRS, Institut Pascal, F-63000 Clermont-Ferrand, France
- Céline Teulière
- Université Clermont Auvergne, Clermont Auvergne INP, CNRS, Institut Pascal, F-63000 Clermont-Ferrand, France
- Jérôme Coste
- Université Clermont Auvergne, Clermont Auvergne INP, CNRS, Institut Pascal, F-63000 Clermont-Ferrand, France; Université Clermont Auvergne, CNRS, CHU Clermont-Ferrand, Clermont Auvergne INP, Institut Pascal, F-63000 Clermont-Ferrand, France
- Rémi Chaix
- Université Clermont Auvergne, Clermont Auvergne INP, CNRS, Institut Pascal, F-63000 Clermont-Ferrand, France; Université Clermont Auvergne, CNRS, CHU Clermont-Ferrand, Clermont Auvergne INP, Institut Pascal, F-63000 Clermont-Ferrand, France
- Omar Ouachik
- Université Clermont Auvergne, Clermont Auvergne INP, CNRS, Institut Pascal, F-63000 Clermont-Ferrand, France
- Bruno Pereira
- Direction de la Recherche et de l'Innovation, CHU Clermont-Ferrand, F-63000 Clermont-Ferrand, France
- Jean-Jacques Lemaire
- Université Clermont Auvergne, Clermont Auvergne INP, CNRS, Institut Pascal, F-63000 Clermont-Ferrand, France; Université Clermont Auvergne, CNRS, CHU Clermont-Ferrand, Clermont Auvergne INP, Institut Pascal, F-63000 Clermont-Ferrand, France
5. Tang R, Zhao H, Tong Y, Mu R, Wang Y, Zhang S, Zhao Y, Wang W, Zhang M, Liu Y, Gao J. A frequency attention-embedded network for polyp segmentation. Sci Rep 2025; 15:4961. [PMID: 39929863] [PMCID: PMC11811025] [DOI: 10.1038/s41598-025-88475-6]
Abstract
Gastrointestinal polyps are observed and treated under endoscopy, which makes accurate segmentation of polyps in endoscopic images an important and challenging task. Current methodologies often falter in distinguishing complex polyp structures within diverse mucosal tissue environments. In this paper, we propose the Frequency Attention-Embedded Network (FAENet), a novel approach leveraging frequency-based attention mechanisms to significantly enhance polyp segmentation accuracy. FAENet segregates and processes image data into high- and low-frequency components, enabling precise delineation of polyp boundaries and internal structures by integrating intra-component and cross-component attention mechanisms. This method not only preserves essential edge details but also attentively refines the learned representation, ensuring robust segmentation across varied imaging conditions. Comprehensive evaluations on two public datasets, Kvasir-SEG and CVC-ClinicDB, demonstrate FAENet's superiority over several state-of-the-art models in terms of Dice coefficient, Intersection over Union (IoU), sensitivity, and specificity. The results affirm that FAENet's advanced attention mechanisms significantly improve segmentation quality, outperforming traditional and contemporary techniques. FAENet's success indicates its potential to improve polyp segmentation in clinical practice, supporting the diagnosis and efficient treatment of gastrointestinal polyps.
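The high/low-frequency decomposition that FAENet builds on can be illustrated with a simple FFT low-pass split. This is a minimal sketch of the general idea only: the circular mask and cutoff radius are illustrative assumptions, and FAENet's attention modules are not reproduced here.

```python
import numpy as np

def frequency_split(image, radius=8):
    """Split a 2D image into low- and high-frequency components via an FFT mask.

    image: (H, W) float array. Returns (low, high) with low + high == image.
    """
    f = np.fft.fftshift(np.fft.fft2(image))
    h, w = image.shape
    yy, xx = np.ogrid[:h, :w]
    # Circular low-pass mask centred on the zero-frequency component.
    low_mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2
    low = np.fft.ifft2(np.fft.ifftshift(f * low_mask)).real
    high = image - low     # residual carries edges and fine texture
    return low, high
```

The high-frequency residual concentrates the boundary detail that a segmentation network can attend to separately from the smooth low-frequency content.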
Affiliation(s)
- Rui Tang
- Department of Orthopedics, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China
- Hejing Zhao
- Research Center on Flood and Drought Disaster Reduction of Ministry of Water Resource, China Institute of Water Resources and Hydropower Research, Beijing, 100038, China
- Water History Department, China Institute of Water Resources and Hydropower Research, Beijing, 100038, China
- Yao Tong
- School of Artificial Intelligence and Information Technology, Nanjing University of Chinese Medicine, Nanjing, 210023, China
- Jiangsu Province Engineering Research Center of TCM Intelligence Health Service, Nanjing University of Chinese Medicine, Nanjing, 210023, China
- Ruihui Mu
- College of Computer and Information, Xinxiang University, Xinxiang, 453000, China
- Yuqiang Wang
- Department of Orthopedics, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China
- Shuhao Zhang
- Department of Orthopedics, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China
- Yao Zhao
- Department of Orthopedics, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China
- Weidong Wang
- Department of Orthopedics, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China
- Min Zhang
- Department of Orthopedics, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China
- Yilin Liu
- Department of Orthopedics, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450000, China
- Jianbo Gao
- Department of Radiology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
6. Ghobadi V, Ismail LI, Wan Hasan WZ, Ahmad H, Ramli HR, Norsahperi NMH, Tharek A, Hanapiah FA. Challenges and solutions of deep learning-based automated liver segmentation: A systematic review. Comput Biol Med 2025; 185:109459. [PMID: 39642700] [DOI: 10.1016/j.compbiomed.2024.109459]
Abstract
The liver is one of the vital organs in the body. Precise liver segmentation in medical images is essential for liver disease treatment. The deep learning-based liver segmentation process faces several challenges. This research aims to analyze the challenges of liver segmentation in prior studies and identify the modifications made to network models and other enhancements implemented by researchers to tackle each challenge. In total, 88 articles from the Scopus and ScienceDirect databases published between January 2016 and January 2022 were studied. The liver segmentation challenges are classified into five main categories, each containing several subcategories. For each challenge, the proposed technique to overcome it is investigated. The provided report details the authors, publication years, dataset types, imaging technologies, and evaluation metrics of all references for comparison. Additionally, a summary table outlines the challenges and solutions.
Affiliation(s)
- Vahideh Ghobadi
- Faculty of Engineering, Universiti Putra Malaysia, Serdang, 43400, Selangor, Malaysia
- Luthffi Idzhar Ismail
- Faculty of Engineering, Universiti Putra Malaysia, Serdang, 43400, Selangor, Malaysia
- Wan Zuha Wan Hasan
- Faculty of Engineering, Universiti Putra Malaysia, Serdang, 43400, Selangor, Malaysia
- Haron Ahmad
- KPJ Specialist Hospital, Damansara Utama, Petaling Jaya, 47400, Selangor, Malaysia
- Hafiz Rashidi Ramli
- Faculty of Engineering, Universiti Putra Malaysia, Serdang, 43400, Selangor, Malaysia
- Anas Tharek
- Hospital Sultan Abdul Aziz Shah, University Putra Malaysia, Serdang, 43400, Selangor, Malaysia
- Fazah Akhtar Hanapiah
- Faculty of Medicine, Universiti Teknologi MARA, Damansara Utama, Sungai Buloh, 47000, Selangor, Malaysia
7. Zhu H, Shu S, Zhang J. A cascaded FAS-UNet+ framework with iterative optimization strategy for segmentation of organs at risk. Med Biol Eng Comput 2025; 63:429-446. [PMID: 39365519] [DOI: 10.1007/s11517-024-03208-7]
Abstract
Segmentation of organs at risk (OARs) in the thorax plays a critical role in radiation therapy for lung and esophageal cancer. Although automatic segmentation of OARs has been extensively studied, it remains challenging due to the varying sizes and shapes of organs, as well as the low contrast between the target and background. This paper proposes a cascaded FAS-UNet+ framework, which integrates convolutional neural networks and nonlinear multigrid theory to solve a modified Mumford-Shah model for segmenting OARs. This framework is equipped with an enhanced iteration block, a coarse-to-fine multiscale architecture, an iterative optimization strategy, and a model ensemble technique. The enhanced iteration block extracts multiscale features, while the cascade module refines coarse segmentation predictions. The iterative optimization strategy improves the network parameters to avoid unfavorable local minima. An efficient data augmentation method is also developed to train the network, which significantly improves its performance. During the prediction stage, a weighted ensemble technique combines predictions from multiple models to refine the final segmentation. The proposed cascaded FAS-UNet+ framework was evaluated on the SegTHOR dataset, and the results demonstrate significant improvements in Dice score and Hausdorff distance (HD). On the official unlabeled dataset, the Dice scores for the aorta and heart were 95.22% and 95.68%, and the corresponding HD values were 0.1024 and 0.1194, respectively. Our code and trained models are available at https://github.com/zhuhui100/C-FASUNet-plus .
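The prediction-stage weighted ensemble described above can be sketched as a normalized weighted average of per-model probability maps followed by thresholding. The weights and the 0.5 threshold are illustrative assumptions; the paper does not list its weighting in this abstract.

```python
import numpy as np

def weighted_ensemble(prob_maps, weights):
    """Fuse per-model foreground-probability maps into one hard mask.

    prob_maps: list of (H, W) arrays, one per trained model.
    weights: one non-negative weight per model (normalised internally).
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                               # normalise the weights
    fused = sum(wi * p for wi, p in zip(w, prob_maps))
    return (fused >= 0.5).astype(np.uint8)        # final segmentation mask
```

Averaging probabilities before thresholding lets a confident majority override a single model's error, which is the usual motivation for ensembling at prediction time.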
Affiliation(s)
- Hui Zhu
- School of Mathematics and Computational Science, Xiangtan University, Xiangtan, 411105, China
- School of Computational Science and Electronics, Hunan Institute of Engineering, Xiangtan, 411104, China
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan, Hunan, 411105, China
- Shi Shu
- School of Mathematics and Computational Science, Xiangtan University, Xiangtan, 411105, China
- Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan, Hunan, 411105, China
- Jianping Zhang
- School of Mathematics and Computational Science, Xiangtan University, Xiangtan, 411105, China
- National Center for Applied Mathematics in Hunan, Xiangtan, Hunan, 411105, China
8. Chu J, Liu W, Tian Q, Lu W. PFPRNet: A Phase-Wise Feature Pyramid With Retention Network for Polyp Segmentation. IEEE J Biomed Health Inform 2025; 29:1137-1150. [PMID: 40030242] [DOI: 10.1109/JBHI.2024.3500026]
Abstract
Early detection of colonic polyps is crucial for the prevention and diagnosis of colorectal cancer. Currently, deep learning-based polyp segmentation methods have become mainstream and achieved remarkable results. However, acquiring a large amount of labeled data is time-consuming and labor-intensive, and the presence of numerous similar wrinkles in polyp images also hampers model prediction performance. In this paper, we propose a novel approach called Phase-wise Feature Pyramid with Retention Network (PFPRNet), which leverages a pre-trained Transformer-based encoder to obtain multi-scale feature maps. A Phase-wise Feature Pyramid with Retention Decoder is designed to gradually integrate global features into local features and guide the model's attention towards key regions. Additionally, our custom Enhance Perception module enables capturing image information from a broader perspective. Finally, we introduce an innovative Low-layer Retention module as an alternative to the Transformer for more efficient global attention modeling. Evaluation results on several widely used polyp segmentation datasets demonstrate that our proposed method has strong learning ability and generalization capability, and outperforms the state-of-the-art approaches.
9. Guo T, Luan J, Gao J, Liu B, Shen T, Yu H, Ma G, Wang K. Computer-aided diagnosis of pituitary microadenoma on dynamic contrast-enhanced MRI based on spatio-temporal features. Expert Syst Appl 2025; 260:125414. [DOI: 10.1016/j.eswa.2024.125414]
10. Wang Z, Shi C, Wong C, Oderinde SM, Watkins WT, Qing K, Liu B, Williams TM, Liu A, Han C. Comparison of Deep Learning-Based Auto-Segmentation Results on Daily Kilovoltage, Megavoltage, and Cone Beam CT Images in Image-Guided Radiotherapy. Technol Cancer Res Treat 2025; 24:15330338251344198. [PMID: 40397131] [PMCID: PMC12099101] [DOI: 10.1177/15330338251344198]
Abstract
Introduction: This study aims to evaluate auto-segmentation results using deep learning-based auto-segmentation models on different online CT imaging modalities in image-guided radiotherapy. Methods: Phantom studies were first performed to benchmark image quality. Daily CT images for sixty patients were retrospectively retrieved from fan-beam kilovoltage CT (kVCT), kV cone-beam CT (kV-CBCT), and megavoltage CT (MVCT) scans. For each imaging modality, half of the patients received CT scans in the pelvic region and the other half in the thoracic region. Deep learning auto-segmentation models using a convolutional neural network algorithm were used to generate organs-at-risk contours. Quantitative metrics were calculated to compare auto-segmentation results with manual contours. Results: The auto-segmentation contours on kVCT images showed statistically significant differences in Dice similarity coefficient (DSC), Jaccard similarity coefficient, sensitivity index, inclusiveness index, and the 95th percentile Hausdorff distance, compared to those on kV-CBCT and MVCT images for most major organs. In the pelvic region, the largest difference in DSC was observed for the bowel volume, with an average DSC of 0.84 ± 0.05, 0.35 ± 0.23, and 0.48 ± 0.27 for kVCT, kV-CBCT, and MVCT images, respectively (p-value < 0.05); in the thoracic region, the largest difference in DSC was found for the esophagus, with an average DSC of 0.63 ± 0.16, 0.18 ± 0.13, and 0.22 ± 0.08 for kVCT, kV-CBCT, and MVCT images, respectively (p-value < 0.05). Conclusion: Deep learning-based auto-segmentation models showed better agreement with manual contouring when using kVCT images compared to kV-CBCT or MVCT images. However, manual correction remains necessary after auto-segmentation with all imaging modalities, particularly for organs with limited contrast from surrounding tissues. These findings underscore the potential and limitations of applying deep learning-based auto-segmentation models for adaptive radiotherapy.
Affiliation(s)
- Zhixing Wang
- Department of Radiation Oncology, City of Hope, Duarte, CA, USA
- Chengyu Shi
- Department of Radiation Oncology, City of Hope, Duarte, CA, USA
- Carson Wong
- Department of Radiation Oncology, City of Hope, Duarte, CA, USA
- Kun Qing
- Department of Radiation Oncology, City of Hope, Duarte, CA, USA
- Bo Liu
- Department of Radiation Oncology, City of Hope, Duarte, CA, USA
- An Liu
- Department of Radiation Oncology, City of Hope, Duarte, CA, USA
- Chunhui Han
- Department of Radiation Oncology, City of Hope, Duarte, CA, USA
11. Tiraboschi C, Parenti F, Sangalli F, Resovi A, Belotti D, Lanzarone E. Automatic Segmentation of Metastatic Livers by Means of U-Net-Based Procedures. Cancers (Basel) 2024; 16:4159. [PMID: 39766059] [PMCID: PMC11674041] [DOI: 10.3390/cancers16244159]
Abstract
Background: The liver is one of the most common sites for the spread of pancreatic ductal adenocarcinoma (PDAC) cells, with metastases present in about 80% of patients. Clinical and preclinical studies of PDAC require quantification of the liver's metastatic burden from several acquired images, which can benefit from automatic image segmentation tools. Methods: We developed three neural networks based on the U-net architecture to automatically segment the healthy liver area (HL), the metastatic liver area (MLA), and liver metastases (LM) in micro-CT images of a mouse model of PDAC with liver metastasis. Three alternative U-nets were trained for each structure to be segmented, following appropriate image preprocessing, and the one with the highest performance was then chosen and applied for each case. Results: Good performance was achieved, with accuracy of 92.6%, 88.6%, and 91.5%, specificity of 95.5%, 93.8%, and 99.9%, Dice of 71.6%, 74.4%, and 29.9%, and negative predictive value (NPV) of 97.9%, 91.5%, and 91.5% on the pilot validation set for the chosen HL, MLA, and LM networks, respectively. Conclusions: The networks provided good performance and advantages in terms of saving time and ensuring reproducibility.
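The accuracy, specificity, Dice, and NPV figures reported above follow standard confusion-matrix definitions, which can be computed for binary masks as in this generic sketch (not the authors' evaluation code).

```python
import numpy as np

def pixel_metrics(pred, gt):
    """Accuracy, specificity, Dice, and negative predictive value (NPV)
    from a pixel-wise confusion matrix of two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt)     # true positives
    tn = np.sum(~pred & ~gt)   # true negatives
    fp = np.sum(pred & ~gt)    # false positives
    fn = np.sum(~pred & gt)    # false negatives
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    specificity = tn / (tn + fp)
    dice = 2 * tp / (2 * tp + fp + fn)
    npv = tn / (tn + fn)
    return accuracy, specificity, dice, npv
```

Note how accuracy and specificity can stay high while Dice is low when the foreground is small, which matches the pattern in the LM results above.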
Affiliation(s)
- Camilla Tiraboschi
- Department of Management, Information and Production Engineering, University of Bergamo, 24044 Dalmine, BG, Italy
- Federica Parenti
- Department of Management, Information and Production Engineering, University of Bergamo, 24044 Dalmine, BG, Italy
- Fabio Sangalli
- Department of Biomedical Engineering, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, 24126 Bergamo, BG, Italy
- Andrea Resovi
- Department of Oncology, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, 24126 Bergamo, BG, Italy
- Dorina Belotti
- Department of Oncology, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, 24126 Bergamo, BG, Italy
- Ettore Lanzarone
- Department of Management, Information and Production Engineering, University of Bergamo, 24044 Dalmine, BG, Italy
12. Gul S, Khan MS, Hossain MSA, Chowdhury MEH, Sumon MSI. A Comparative Study of Decoders for Liver and Tumor Segmentation Using a Self-ONN-Based Cascaded Framework. Diagnostics (Basel) 2024; 14:2761. [PMID: 39682669] [DOI: 10.3390/diagnostics14232761]
Abstract
Background/Objectives: Accurate liver and tumor detection and segmentation are crucial in the diagnosis of early-stage liver malignancies. As opposed to manual interpretation, which is a difficult and time-consuming process, accurate tumor detection using a computer-aided diagnosis system can save both time and human effort. Methods: We propose a cascaded encoder-decoder technique based on self-organized neural networks, a recent variant of operational neural networks (ONNs), for accurate segmentation and identification of liver tumors. The first encoder-decoder CNN segments the liver. To generate the liver region of interest, the segmented liver mask is placed over the input computed tomography (CT) image and then fed to the second Self-ONN model for tumor segmentation. For further investigation, three other distinct encoder-decoder architectures, U-Net, feature pyramid networks (FPNs), and U-Net++, were also evaluated, altering the encoder backbones with ResNet and DenseNet variants for transfer learning. Results: For the liver segmentation task, Self-ONN with a ResNet18 backbone achieved a Dice similarity coefficient (DSC) of 98.182% and an intersection over union (IoU) of 97.436%. Tumor segmentation with Self-ONN with the DenseNet201 encoder resulted in an outstanding DSC of 92.836% and an IoU of 91.748%. Conclusions: The suggested method is capable of precisely locating liver tumors of various sizes and shapes, including tiny infection patches that were reported to be challenging to find in earlier research.
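The cascade's region-of-interest step, in which the first network's liver mask is placed over the CT image before the tumor model runs, can be sketched as a simple masking operation. The fill value for non-liver voxels (air, -1000 HU) and the function name are illustrative assumptions, not details from the paper.

```python
import numpy as np

def liver_roi(ct_slice, liver_mask, background=-1000.0):
    """Build the liver region of interest fed to the second-stage tumor model.

    ct_slice: (H, W) array of CT intensities (HU).
    liver_mask: (H, W) binary mask from the first-stage liver network.
    Non-liver pixels are replaced so the tumor model only sees liver tissue.
    """
    return np.where(liver_mask.astype(bool), ct_slice, background)
```

Restricting the second stage to the liver ROI shrinks the search space, which is the usual rationale for cascaded organ-then-lesion segmentation.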
Affiliation(s)
- Sidra Gul
- Department of Computer Systems Engineering, University of Engineering and Technology, Peshawar 25000, Pakistan
- Artificial Intelligence in Healthcare, Intelligent Information Processing Lab, National Center of Artificial Intelligence, Peshawar 25000, Pakistan
- Muhammad Salman Khan
- Department of Electrical Engineering, College of Engineering, Qatar University, Doha 2713, Qatar
- Md Sakib Abrar Hossain
- Department of Electrical Engineering, College of Engineering, Qatar University, Doha 2713, Qatar
- Muhammad E H Chowdhury
- Department of Electrical Engineering, College of Engineering, Qatar University, Doha 2713, Qatar
- Md Shaheenur Islam Sumon
- Department of Electrical Engineering, College of Engineering, Qatar University, Doha 2713, Qatar
13
Zhang T, Liu Y, Zhao Q, Xue G, Shen H. Edge-guided multi-scale adaptive feature fusion network for liver tumor segmentation. Sci Rep 2024; 14:28370. [PMID: 39551810 PMCID: PMC11570674 DOI: 10.1038/s41598-024-79379-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2024] [Accepted: 11/08/2024] [Indexed: 11/19/2024] Open
Abstract
Automated segmentation of liver tumors on CT scans is essential for aiding diagnosis and assessing treatment. Computer-aided diagnosis can reduce the costs and errors associated with manual processes and ensure the provision of accurate and reliable clinical assessments. However, liver tumors in CT images vary significantly in size and have fuzzy boundaries, making it difficult for existing methods to achieve accurate segmentation. Therefore, this paper proposes MAEG-Net, a multi-scale adaptive feature fusion liver tumor segmentation network based on edge guidance. Specifically, we design a multi-scale adaptive feature fusion module that effectively incorporates multi-scale information to better guide the segmentation of tumors of different sizes. Additionally, to address the problem of blurred tumor boundaries in images, we introduce an edge-aware guidance module to improve the model's feature learning ability under these conditions. Evaluation results on the liver tumor dataset (LiTS2017) show that our method achieves a Dice coefficient of 71.84% and a VOE of 38.64%, demonstrating the best performance for liver tumor segmentation in CT images.
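The two figures reported above, Dice and VOE (volumetric overlap error), are tied together through the intersection over union. A minimal numpy sketch of both metrics on toy masks (illustrative, not MAEG-Net outputs):

```python
import numpy as np

def dice(pred, gt):
    """Dice coefficient: 2|A∩B| / (|A| + |B|)."""
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())

def voe(pred, gt):
    """Volumetric overlap error: 1 - intersection over union."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return 1.0 - inter / union

pred = np.array([[1, 1, 0], [0, 1, 0]], dtype=bool)
gt   = np.array([[1, 0, 0], [0, 1, 1]], dtype=bool)
d, v = dice(pred, gt), voe(pred, gt)

# The two metrics always satisfy dice = 2*(1 - voe) / (2 - voe)
iou_val = 1.0 - v
assert abs(d - 2 * iou_val / (1 + iou_val)) < 1e-9
```

Because the two metrics are monotone transforms of each other, reporting both mainly helps comparison across papers that standardize on different conventions.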
Affiliation(s)
- Tiange Zhang
- School of Digital and Intelligent Industry, Inner Mongolia University of Science & Technology, Baotou, 014010, China
- Yuefeng Liu
- School of Digital and Intelligent Industry, Inner Mongolia University of Science & Technology, Baotou, 014010, China.
- Qiyan Zhao
- School of Digital and Intelligent Industry, Inner Mongolia University of Science & Technology, Baotou, 014010, China
- Guoyue Xue
- School of Digital and Intelligent Industry, Inner Mongolia University of Science & Technology, Baotou, 014010, China
- Hongyu Shen
- School of Digital and Intelligent Industry, Inner Mongolia University of Science & Technology, Baotou, 014010, China
14
Huang W, Zhang L, Wang Z, Wang L. Exploring Inherent Consistency for Semi-Supervised Anatomical Structure Segmentation in Medical Imaging. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:3731-3741. [PMID: 38743533 DOI: 10.1109/tmi.2024.3400840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/16/2024]
Abstract
Due to the exorbitant expense of obtaining labeled data in the field of medical image analysis, semi-supervised learning has emerged as a favorable method for the segmentation of anatomical structures. Although semi-supervised learning techniques have shown great potential in this field, existing methods only utilize image-level spatial consistency to impose unsupervised regularization on data in label space. Considering that anatomical structures often possess inherent anatomical properties that have not been focused on in previous works, this study introduces the inherent consistency into semi-supervised anatomical structure segmentation. First, the prediction and the ground-truth are projected into an embedding space to obtain latent representations that encapsulate the inherent anatomical properties of the structures. Then, two inherent consistency constraints are designed to leverage these inherent properties by aligning these latent representations. The proposed method is plug-and-play and can be seamlessly integrated with existing methods, thereby collaborating to improve segmentation performance and enhance the anatomical plausibility of the results. To evaluate the effectiveness of the proposed method, experiments are conducted on three public datasets (ACDC, LA, and Pancreas). Extensive experimental results demonstrate that the proposed method exhibits good generalizability and outperforms several state-of-the-art methods.
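The latent-alignment idea above can be sketched with a toy linear projection standing in for the learned embedding network (an illustrative assumption, not the paper's architecture): the prediction and the ground truth are mapped into the same embedding space, and the consistency term penalizes the distance between their latent codes.

```python
import numpy as np

def latent_consistency_loss(pred, gt, proj):
    """L2 distance between latent codes of prediction and ground truth.

    proj is a shared projection into the embedding space; here a fixed
    random matrix stands in for a learned encoder.
    """
    z_pred = pred.reshape(-1) @ proj
    z_gt = gt.reshape(-1) @ proj
    return float(np.mean((z_pred - z_gt) ** 2))

rng = np.random.default_rng(0)
proj = rng.normal(size=(16, 4))   # toy embedding: 16 pixels -> 4 latent dims
gt = rng.random((4, 4))

# A perfect prediction incurs zero consistency loss; any deviation is penalized
loss_same = latent_consistency_loss(gt, gt, proj)
loss_diff = latent_consistency_loss(gt + 1.0, gt, proj)
```

The appeal of constraining the latent codes rather than the raw masks is that the embedding can encode anatomical properties (shape, topology) that a per-pixel loss cannot see.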
15
Zhang Z, Gao J, Li S, Wang H. RMCNet: A Liver Cancer Segmentation Network Based on 3D Multi-Scale Convolution, Attention, and Residual Path. Bioengineering (Basel) 2024; 11:1073. [PMID: 39593733 PMCID: PMC11591158 DOI: 10.3390/bioengineering11111073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2024] [Revised: 10/22/2024] [Accepted: 10/24/2024] [Indexed: 11/28/2024] Open
Abstract
Abdominal CT images are important clues for diagnosing liver cancer lesions. However, liver cancer presents challenges such as significant differences in tumor size, shape, and location, which can affect segmentation accuracy. To address these challenges, we propose an end-to-end 3D segmentation algorithm, RMCNet. In the shallow encoding part of RMCNet, we incorporated a 3D multi-scale convolution module to more effectively extract tumors of varying sizes. Moreover, the convolutional block attention module (CBAM) is used in the encoding part to help the model focus on both the shape and location of tumors. Additionally, a residual path is introduced in each encoding layer to further enrich the extracted feature maps. Our method achieved DSC scores of 76.56% and 72.96%, JCC scores of 75.82% and 71.25%, HD values of 11.07 mm and 17.06 mm, and ASD values of 2.54 mm and 10.51 mm on the MICCAI 2017 Liver Tumor Segmentation public dataset and the 3Dircadb-01 public dataset, respectively. Compared with other methods, RMCNet demonstrates superior segmentation performance and stronger generalization capability.
Affiliation(s)
- Zerui Zhang
- School of Bioengineering, Chongqing University, Chongqing 400044, China;
- Jianyun Gao
- Medical Device Institute, Shenyang Pharmaceutical University, Benxi 117004, China;
- Shu Li
- Institute for Medical Device Control, National Institutes for Food and Drug Control, Beijing 102629, China
- Hao Wang
- Institute for Medical Device Control, National Institutes for Food and Drug Control, Beijing 102629, China
16
Appati JK, Yirenkyi IA. A cascading approach using se-resnext, resnet and feature pyramid network for kidney tumor segmentation. Heliyon 2024; 10:e38612. [PMID: 39430467 PMCID: PMC11489355 DOI: 10.1016/j.heliyon.2024.e38612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2024] [Revised: 09/25/2024] [Accepted: 09/26/2024] [Indexed: 10/22/2024] Open
Abstract
Accurate segmentation of kidney tumors in CT images is very important in the diagnosis of kidney cancer. Automatic semantic segmentation of kidney tumors has shown promising results for developing advanced surgical planning techniques in the treatment of kidney cancer. However, the relatively small size of the kidney tumor volume in comparison with the overall kidney volume, together with its irregular distribution and shape, makes accurate segmentation difficult. To address this issue, we propose a coarse-to-fine segmentation approach that leverages transfer learning, using an SE-ResNeXt model for the initial segmentation and ResNet with a Feature Pyramid Network for the final segmentation. The two stages are linked: the output of the initial segmentation is used in training the final model. We trained and evaluated our method on the KiTS19 dataset and achieved a Dice score of 0.7388 and a Jaccard score of 0.7321 for the final segmentation, demonstrating promising results compared with other approaches.
17
Delmoral JC, R S Tavares JM. Semantic Segmentation of CT Liver Structures: A Systematic Review of Recent Trends and Bibliometric Analysis : Neural Network-based Methods for Liver Semantic Segmentation. J Med Syst 2024; 48:97. [PMID: 39400739 PMCID: PMC11473507 DOI: 10.1007/s10916-024-02115-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2024] [Accepted: 10/02/2024] [Indexed: 10/15/2024]
Abstract
The use of artificial intelligence (AI) in the segmentation of liver structures in medical images has become a popular research focus in the past half-decade. The performance of AI tools for this task may vary widely and has been tested in the literature on various datasets. However, no scientometric report has provided a systematic overview of this scientific area. This article presents a systematic and bibliometric review of recent advances in neural network modeling approaches, mainly deep learning, to outline the field's multiple research directions in terms of algorithmic features. It therefore provides a detailed systematic review of the most relevant publications addressing fully automatic semantic segmentation of liver structures in computed tomography (CT) images in terms of algorithm modeling objective, performance benchmark, and model complexity. The review suggests that fully automatic hybrid 2D and 3D networks are the top performers in semantic segmentation of the liver. For liver tumor and vasculature segmentation, fully automatic generative approaches perform best. However, the reported performance benchmarks indicate that there is still much to be improved in segmenting such small structures in high-resolution abdominal CT scans.
Affiliation(s)
- Jessica C Delmoral
- Instituto de Ciência e Inovação em Engenharia Mecânica e Engenharia Industrial, Faculdade de Engenharia, Universidade do Porto, Rua Dr. Roberto Frias, s/n, 4200-465, Porto, Portugal
- João Manuel R S Tavares
- Instituto de Ciência e Inovação em Engenharia Mecânica e Engenharia Industrial, Departamento de Engenharia Mecânica, Faculdade de Engenharia, Universidade do Porto, Rua Dr. Roberto Frias, s/n, 4200-465, Porto, Portugal.
18
Zhang W, Jin X, Wang C, Jiang S, Yan J, Li Y. Spontaneous rupture and hemorrhage of renal epithelioid angiomyolipoma misdiagnosed to renal carcinoma: a case report. J Med Case Rep 2024; 18:425. [PMID: 39261965 PMCID: PMC11391642 DOI: 10.1186/s13256-024-04743-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/11/2023] [Accepted: 08/07/2024] [Indexed: 09/13/2024] Open
Abstract
BACKGROUND Renal epithelioid angiomyolipoma is a rare and unique subtype of classic angiomyolipoma, characterized by the presence of epithelioid cells. It often presents with nonspecific symptoms and can be easily misdiagnosed due to its similarity to renal cell carcinoma and classic angiomyolipoma in clinical and radiological features. This case report is significant for its demonstration of the challenges in diagnosing epithelioid angiomyolipoma and its emphasis on the importance of accurate differentiation from renal cell carcinoma and classic angiomyolipoma. CASE PRESENTATION A 58-year-old Asian female presented with sudden left flank pain and was initially diagnosed with a malignant renal tumor based on imaging studies. She underwent laparoscopic radical nephrectomy, and postoperative histopathology confirmed the diagnosis of epithelioid angiomyolipoma. The patient recovered well and is currently in good health with regular follow-ups. This case highlights the diagnostic challenges, with a focus on the clinical, radiological, and histopathological features that eventually led to the identification of epithelioid angiomyolipoma. CONCLUSIONS Epithelioid angiomyolipoma is easily misdiagnosed in clinical work. When dealing with these patients, it is necessary to make a comprehensive diagnosis based on clinical symptoms, imaging manifestations, and pathological characteristics.
Affiliation(s)
- Wenhao Zhang
- Department of Urology, The First Affiliated Hospital of Zhejiang Chinese Medical University (Zhejiang Provincial Hospital of Chinese Medicine), Hangzhou, 310006, Zhejiang, People's Republic of China
- Xiaodong Jin
- Department of Urology, The First Affiliated Hospital of Zhejiang Chinese Medical University (Zhejiang Provincial Hospital of Chinese Medicine), Hangzhou, 310006, Zhejiang, People's Republic of China
- Chundan Wang
- Department of Pathology, The First Affiliated Hospital of Zhejiang Chinese Medical University (Zhejiang Provincial Hospital of Chinese Medicine), Hangzhou, 310006, Zhejiang, People's Republic of China
- Shaobo Jiang
- Department of Urology, The First Affiliated Hospital of Zhejiang Chinese Medical University (Zhejiang Provincial Hospital of Chinese Medicine), Hangzhou, 310006, Zhejiang, People's Republic of China
- Jiasheng Yan
- Department of Urology, The First Affiliated Hospital of Zhejiang Chinese Medical University (Zhejiang Provincial Hospital of Chinese Medicine), Hangzhou, 310006, Zhejiang, People's Republic of China
- Yubing Li
- Department of Urology, The First Affiliated Hospital of Zhejiang Chinese Medical University (Zhejiang Provincial Hospital of Chinese Medicine), Hangzhou, 310006, Zhejiang, People's Republic of China.
19
Lai J, Luo Z, Liu J, Hu H, Jiang H, Liu P, He L, Cheng W, Ren W, Wu Y, Piao JG, Wu Z. Charged Gold Nanoparticles for Target Identification-Alignment and Automatic Segmentation of CT Image-Guided Adaptive Radiotherapy in Small Hepatocellular Carcinoma. NANO LETTERS 2024; 24:10614-10623. [PMID: 39046153 PMCID: PMC11363118 DOI: 10.1021/acs.nanolett.4c02823] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/17/2024] [Revised: 07/19/2024] [Accepted: 07/22/2024] [Indexed: 07/25/2024]
Abstract
Because of the challenges posed by anatomical uncertainties and the low resolution of plain computed tomography (CT) scans, implementing adaptive radiotherapy (ART) for small hepatocellular carcinoma (sHCC) using artificial intelligence (AI) faces obstacles in tumor identification-alignment and automatic segmentation. The current study aims to improve sHCC imaging for ART using a gold nanoparticle (Au NP)-based CT contrast agent to enhance AI-driven automated image processing. The synthesized charged Au NPs demonstrated notable in vitro aggregation, low cytotoxicity, and minimal organ toxicity. Over time, an in situ sHCC mouse model was established for in vivo CT imaging at multiple time points. The enhanced CT images processed using 3D U-Net and 3D Trans U-Net AI models demonstrated high geometric and dosimetric accuracy. Therefore, charged Au NPs enable accurate and automatic sHCC segmentation in CT images using classical AI models, potentially addressing the technical challenges related to tumor identification, alignment, and automatic segmentation in CT-guided online ART.
Affiliation(s)
- Jianjun Lai
- Department of Radiation Oncology, Zhejiang Hospital, Hangzhou 310013, China
- Institute of Intelligent Control and Robotics, Hangzhou Dianzi University, Hangzhou 310018, China
- Zhizeng Luo
- Institute of Intelligent Control and Robotics, Hangzhou Dianzi University, Hangzhou 310018, China
- Jiping Liu
- Department of Radiation Physics, Zhejiang Cancer Hospital, Hangzhou 310022, China
- Haili Hu
- Department of Radiation Oncology, Zhejiang Hospital, Hangzhou 310013, China
- Hao Jiang
- Department of Radiation Oncology, Zhejiang Hospital, Hangzhou 310013, China
- Pengyuan Liu
- Department of Radiation Oncology, Zhejiang Hospital, Hangzhou 310013, China
- Li He
- School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, China
- Weiyi Cheng
- School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, China
- Weiye Ren
- School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, China
- Yajun Wu
- Department of Pharmacy, Zhejiang Hospital, Hangzhou 310013, China
- Ji-Gang Piao
- School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, China
- Zhibing Wu
- Department of Radiation Oncology, Zhejiang Hospital, Hangzhou 310013, China
- Department of Radiation Oncology, Affiliated Zhejiang Hospital, Zhejiang University School of Medicine, Hangzhou 310013, China
20
Wu H, Min W, Gai D, Huang Z, Geng Y, Wang Q, Chen R. HD-Former: A hierarchical dependency Transformer for medical image segmentation. Comput Biol Med 2024; 178:108671. [PMID: 38870721 DOI: 10.1016/j.compbiomed.2024.108671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2023] [Revised: 04/20/2024] [Accepted: 05/26/2024] [Indexed: 06/15/2024]
Abstract
Medical image segmentation is a compelling fundamental problem and an important auxiliary tool for clinical applications. Recently, the Transformer has emerged as a valuable tool for addressing the limitations of convolutional neural networks (CNNs) by effectively capturing global relationships, and numerous hybrid architectures combining CNNs and Transformers have been devised to enhance segmentation performance. However, they suffer from multilevel semantic feature gaps and fail to account for multilevel dependencies between space and channel. In this paper, we propose a hierarchical dependency Transformer for medical image segmentation, named HD-Former. First, we utilize a Compressed Bottleneck (CB) module to enrich shallow features and localize the target region. We then introduce the Dual Cross Attention Transformer (DCAT) module to fuse multilevel features and bridge the feature gap. In addition, we design a broad exploration network (BEN) that cascades convolution and self-attention over different receptive fields to capture hierarchical dense contextual semantic features both locally and globally. Finally, we exploit an uncertain multitask edge loss to adaptively map predictions to a consistent feature space, which can optimize segmentation edges. Extensive experiments on medical image segmentation with the ISIC, LiTS, Kvasir-SEG, and CVC-ClinicDB datasets demonstrate that HD-Former surpasses state-of-the-art methods in terms of both subjective visual performance and objective evaluation. Code: https://github.com/barcelonacontrol/HD-Former.
Affiliation(s)
- Haifan Wu
- School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China.
- Weidong Min
- School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China; Institute of Metaverse, Nanchang University, Nanchang, 330031, China; Jiangxi Key Laboratory of Virtual Reality, Nanchang, 330031, China.
- Di Gai
- School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China; Institute of Metaverse, Nanchang University, Nanchang, 330031, China; Jiangxi Key Laboratory of Virtual Reality, Nanchang, 330031, China.
- Zheng Huang
- School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China.
- Yuhan Geng
- School of Public Health, University of Michigan, Ann Arbor, MI, 48105, USA.
- Qi Wang
- School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China; Institute of Metaverse, Nanchang University, Nanchang, 330031, China; Jiangxi Key Laboratory of Virtual Reality, Nanchang, 330031, China.
- Ruibin Chen
- School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China; Information Department, The First Affiliated Hospital of Nanchang University, Nanchang, 330096, China.
21
Hsiao CH, Lin FYS, Sun TL, Liao YY, Wu CH, Lai YC, Wu HP, Liu PR, Xiao BR, Chen CH, Huang Y. Precision and Robust Models on Healthcare Institution Federated Learning for Predicting HCC on Portal Venous CT Images. IEEE J Biomed Health Inform 2024; 28:4674-4687. [PMID: 38739503 DOI: 10.1109/jbhi.2024.3400599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/16/2024]
Abstract
Hepatocellular carcinoma (HCC), the most common type of liver cancer, poses significant challenges in detection and diagnosis. Medical imaging, especially computed tomography (CT), is pivotal in non-invasively identifying this disease, requiring substantial expertise for interpretation. This research introduces an innovative strategy that integrates two-dimensional (2D) and three-dimensional (3D) deep learning models within a federated learning (FL) framework for precise segmentation of liver and tumor regions in medical images. The study utilized 131 CT scans from the Liver Tumor Segmentation (LiTS) challenge and demonstrated the superior efficiency and accuracy of the proposed Hybrid-ResUNet model with a Dice score of 0.9433 and an AUC of 0.9965 compared to ResNet and EfficientNet models. This FL approach is beneficial for conducting large-scale clinical trials while safeguarding patient privacy across healthcare settings. It facilitates active engagement in problem-solving, data collection, model development, and refinement. The study also addresses data imbalances in the FL context, showing resilience and highlighting local models' robust performance. Future research will concentrate on refining federated learning algorithms and their incorporation into the continuous implementation and deployment (CI/CD) processes in AI system operations, emphasizing the dynamic involvement of clients. We recommend a collaborative human-AI endeavor to enhance feature extraction and knowledge transfer. These improvements are intended to boost equitable and efficient data collaboration across various sectors in practical scenarios, offering a crucial guide for forthcoming research in medical AI.
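The aggregation step at the heart of such a federated learning framework can be sketched as a size-weighted mean of client parameters, in the spirit of FedAvg (a generic sketch, not the paper's exact aggregation rule; the toy arrays stand in for model weight tensors):

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Federated averaging: size-weighted mean of client parameter tensors.

    Each client trains locally on its own data; only the weights, never
    the patient images, leave the institution.
    """
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Toy single-tensor "models" from three hospitals with unequal data sizes
w_a = np.array([1.0, 2.0])
w_b = np.array([3.0, 4.0])
w_c = np.array([5.0, 6.0])
global_w = fedavg([w_a, w_b, w_c], client_sizes=[10, 10, 20])
print(global_w)  # weighted toward the larger client: [3.5 4.5]
```

The size weighting is also where the data-imbalance issue discussed in the study enters: a client with few scans contributes proportionally little to the global model unless the weighting is adjusted.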
22
Jiao C, Lao Y, Zhang W, Braunstein S, Salans M, Villanueva-Meyer JE, Hervey-Jumper SL, Yang B, Morin O, Valdes G, Fan Z, Shiroishi M, Zada G, Sheng K, Yang W. Multi-modal fusion and feature enhancement U-Net coupling with stem cell niches proximity estimation for voxel-wise GBM recurrence prediction. Phys Med Biol 2024; 69:10.1088/1361-6560/ad64b8. [PMID: 39019073 PMCID: PMC11308744 DOI: 10.1088/1361-6560/ad64b8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2024] [Accepted: 07/17/2024] [Indexed: 07/19/2024]
Abstract
Objective. We aim to develop a Multi-modal Fusion and Feature Enhancement U-Net (MFFE U-Net) coupled with stem cell niche proximity estimation to improve voxel-wise glioblastoma (GBM) recurrence prediction. Approach. 57 patients with pre- and post-surgery magnetic resonance (MR) scans were retrospectively solicited from 4 databases. Post-surgery MR scans included two months before the clinical diagnosis of recurrence and the day of the radiologically confirmed recurrence. The recurrences were manually annotated on the T1ce. The high-risk recurrence region was first determined. Then, a sparse multi-modal feature fusion U-Net was developed. The 50 patients from 3 databases were divided into 70% training, 10% validation, and 20% testing. 7 patients from the 4th institution were used for external testing with transfer learning. Model performance was evaluated by recall, precision, F1-score, and Hausdorff distance at the 95th percentile (HD95). The proposed MFFE U-Net was compared with a support vector machine (SVM) model and two state-of-the-art neural networks. An ablation study was performed. Main results. The MFFE U-Net achieved a precision of 0.79 ± 0.08, a recall of 0.85 ± 0.11, and an F1-score of 0.82 ± 0.09. Statistically significant improvement was observed when comparing MFFE U-Net with the proximity-estimation-coupled SVM (SVMPE), mU-Net, and Deeplabv3. The HD95 was 2.75 ± 0.44 mm and 3.91 ± 0.83 mm for the 10 patients used in model construction and the 7 patients used for external testing, respectively. The ablation test showed that all five MR sequences contributed to the performance of the final model, with T1ce contributing the most. Convergence analysis, time efficiency analysis, and visualization of intermediate results further revealed the characteristics of the proposed method. Significance. We present an advanced multi-modal fusion and feature enhancement learning framework, MFFE U-Net, for effective voxel-wise GBM recurrence prediction. MFFE U-Net performs significantly better than the state-of-the-art networks and can potentially guide early RT intervention for disease recurrence.
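The HD95 metric used above softens the ordinary Hausdorff distance by taking the 95th percentile of boundary distances instead of the maximum, so a handful of outlier voxels cannot dominate the score. A brute-force numpy sketch over small point sets (illustrative, not the study's evaluation code):

```python
import numpy as np

def hd95(pts_a, pts_b):
    """95th-percentile symmetric Hausdorff distance between two point sets."""
    # Pairwise Euclidean distances between every point in A and every point in B
    d = np.linalg.norm(pts_a[:, None, :] - pts_b[None, :, :], axis=-1)
    a_to_b = d.min(axis=1)   # each point in A to its nearest point in B
    b_to_a = d.min(axis=0)   # and vice versa
    return max(np.percentile(a_to_b, 95), np.percentile(b_to_a, 95))

# Toy boundaries: B is A shifted by 1 mm along x, so every distance is 1
pts_a = np.array([[0.0, 0.0], [0.0, 1.0], [0.0, 2.0]])
pts_b = pts_a + np.array([1.0, 0.0])
print(hd95(pts_a, pts_b))  # → 1.0
```

For real masks the point sets would be the surface voxels of the predicted and annotated recurrence regions, scaled by the voxel spacing so the result is in millimeters.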
Affiliation(s)
- Changzhe Jiao
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143
- Yi Lao
- Department of Radiation Oncology, UC Los Angeles, Los Angeles, CA 90095
- Wenwen Zhang
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143
- Steve Braunstein
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143
- Mia Salans
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143
- Bo Yang
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143
- Olivier Morin
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143
- Gilmer Valdes
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143
- Zhaoyang Fan
- Department of Radiology, University of Southern California, Los Angeles, CA 90033
- Mark Shiroishi
- Department of Radiology, University of Southern California, Los Angeles, CA 90033
- Gabriel Zada
- Department of Neurosurgery, University of Southern California, Los Angeles, CA 90033
- Ke Sheng
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143
- Wensha Yang
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143
23
Kumar K, Yeo AU, McIntosh L, Kron T, Wheeler G, Franich RD. Deep Learning Auto-Segmentation Network for Pediatric Computed Tomography Data Sets: Can We Extrapolate From Adults? Int J Radiat Oncol Biol Phys 2024; 119:1297-1306. [PMID: 38246249 DOI: 10.1016/j.ijrobp.2024.01.201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/27/2023] [Revised: 12/10/2023] [Accepted: 01/07/2024] [Indexed: 01/23/2024]
Abstract
PURPOSE Artificial intelligence (AI)-based auto-segmentation models hold promise for enhanced efficiency and consistency in organ contouring for adaptive radiation therapy and radiation therapy planning. However, their performance on pediatric computed tomography (CT) data and cross-scanner compatibility remain unclear. This study aimed to evaluate the performance of AI-based auto-segmentation models trained on adult CT data when applied to pediatric data sets and to explore the improvement in performance gained by including pediatric training data. It also examined their ability to accurately segment CT data acquired from different scanners. METHODS AND MATERIALS Using the nnU-Net framework, segmentation models were trained on data sets of adult, pediatric, and combined CT scans for 7 pelvic/thoracic organs. Each model was trained on 290 to 300 cases per category and organ. Training data sets included a combination of clinical data and several open repositories. The study incorporated a database of 459 pediatric (0-16 years) and 950 adult (>18 years) CT scans, ensuring all scans had human expert ground-truth contours of the selected organs. Performance was evaluated based on Dice similarity coefficients (DSC) of the model-generated contours. RESULTS AI models trained exclusively on adult data underperformed on pediatric data, especially for the 0 to 2 age group: mean DSC was below 0.5 for the bladder and spleen. The addition of pediatric training data demonstrated significant improvement for all age groups, achieving a mean DSC of above 0.85 for all organs in every age group. Larger organs like the liver and kidneys maintained consistent performance for all models across age groups. No significant difference emerged in the cross-scanner performance evaluation, suggesting robust cross-scanner generalization. CONCLUSIONS For optimal segmentation across age groups, it is important to include pediatric data in the training of segmentation models. The successful cross-scanner generalization also supports the real-world clinical applicability of these AI models. This study emphasizes the significance of data set diversity in training robust AI systems for medical image interpretation tasks.
Affiliation(s)
- Kartik Kumar
- Physical Sciences Department, Peter MacCallum Cancer Centre, Victoria, Australia; School of Science, RMIT University, Melbourne, Victoria, Australia
- Adam U Yeo
- Physical Sciences Department, Peter MacCallum Cancer Centre, Victoria, Australia; School of Science, RMIT University, Melbourne, Victoria, Australia; Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria, Australia
- Lachlan McIntosh
- Physical Sciences Department, Peter MacCallum Cancer Centre, Victoria, Australia; School of Science, RMIT University, Melbourne, Victoria, Australia
- Tomas Kron
- Physical Sciences Department, Peter MacCallum Cancer Centre, Victoria, Australia; School of Science, RMIT University, Melbourne, Victoria, Australia; Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria, Australia; Centre for Medical Radiation Physics, University of Wollongong, Wollongong, New South Wales, Australia
- Greg Wheeler
- Physical Sciences Department, Peter MacCallum Cancer Centre, Victoria, Australia; Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria, Australia
- Rick D Franich
- Physical Sciences Department, Peter MacCallum Cancer Centre, Victoria, Australia; School of Science, RMIT University, Melbourne, Victoria, Australia.
24
Chen Y, Che H, Ji Z, Qin J, Huang Y, Wu J. The Recurrent U-Net for Needle Segmentation in Ultrasound Image-Guided Surgery. Annu Int Conf IEEE Eng Med Biol Soc 2024; 2024:1-4. [PMID: 40039871 DOI: 10.1109/embc53108.2024.10782677] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/06/2025]
Abstract
In minimally invasive surgery, poor needle visibility under ultrasound remains a persistent challenge. To reduce the resulting puncture error, deep learning at the image level can effectively assist the surgeon in locating the needle tip and estimating the needle trajectory angle. In this paper, a network structure based on the U-shaped network and the convolutional gated recurrent unit is proposed, which captures information about the needle's movement and improves the accuracy of needle segmentation. The input to the network is three temporally consecutive ultrasound images, and the output is the segmentation of the last frame. The network was trained and tested on our bovine liver dataset containing 192 videos captured from 20 different livers. Our proposed network achieved 85% precision, 86% recall, an 85% F1-score, and 76% IoU, and could localize the needle with a mean tip error of 0.66 mm and a mean trajectory angle error of 0.72°. This work demonstrates the effectiveness of convolutional neural networks in improving needle visibility for minimally invasive surgical procedures. Incorporating needle motion information into the network framework significantly boosts segmentation precision.
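The precision, recall, F1-score, and IoU reported above are standard overlap metrics on binary masks. A minimal sketch of how they are computed (the helper name and toy masks are ours, not the paper's):

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """Precision, recall, F1, and IoU for binary masks (illustrative helper)."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()    # true positives
    fp = np.logical_and(pred, ~gt).sum()   # false positives
    fn = np.logical_and(~pred, gt).sum()   # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return precision, recall, f1, iou

# Toy 2x3 masks: 2 overlapping pixels, 1 spurious, 1 missed.
pred = np.array([[1, 1, 0], [0, 1, 0]])
gt   = np.array([[1, 0, 0], [0, 1, 1]])
p, r, f1, iou = segmentation_metrics(pred, gt)
```

On the toy masks, precision, recall, and F1 all come to 2/3 and IoU to 0.5.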
Collapse
|
25
|
Huang X, Gong H, Zhang J. HST-MRF: Heterogeneous Swin Transformer With Multi-Receptive Field for Medical Image Segmentation. IEEE J Biomed Health Inform 2024; 28:4048-4061. [PMID: 38709610 DOI: 10.1109/jbhi.2024.3397047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/08/2024]
Abstract
The Transformer has been successfully used in medical image segmentation due to its excellent long-range modeling capabilities. However, patch partitioning is necessary when building a Transformer-based model. This process ignores the tissue structure features within each patch, resulting in the loss of shallow representation information. In this study, we propose a Heterogeneous Swin Transformer with Multi-Receptive Field (HST-MRF) model that fuses patch information from different receptive fields to solve the problem of feature information loss caused by patch partitioning. The heterogeneous Swin Transformer (HST) is the core module; it achieves the interaction of multi-receptive-field patch information through heterogeneous attention and passes it to the next stage for progressive learning, thus complementing the patch structure information. We also designed a two-stage fusion module, multimodal bilinear pooling (MBP), to assist the HST in further fusing multi-receptive-field information and combining low-level and high-level semantic information for accurate localization of lesion regions. In addition, we developed adaptive patch embedding (APE) and soft channel attention (SCA) modules to retain more valuable information when acquiring patch embeddings and filtering channel features, respectively, thereby improving segmentation quality. We evaluated HST-MRF on multiple datasets for polyp, skin lesion, and breast ultrasound segmentation tasks. Experimental results show that our proposed method outperforms state-of-the-art models. Furthermore, we verified the effectiveness of each module and the benefits of multi-receptive-field segmentation in reducing the loss of structural information through ablation experiments and qualitative analysis.
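The patch partition that causes the information loss discussed above is a plain non-overlapping split of the image; HST-MRF's contribution is fusing several such partitions at different patch sizes. A generic illustration of the partition itself (the function is ours, not the authors' code):

```python
import numpy as np

def patchify(img, patch):
    """Non-overlapping patch partition at one receptive-field size.
    Structure that crosses a patch border is invisible within any single patch."""
    h, w = img.shape
    return img.reshape(h // patch, patch, w // patch, patch).swapaxes(1, 2)

img = np.arange(16.0).reshape(4, 4)
small = patchify(img, 2)   # finer receptive field: four 2x2 patches
large = patchify(img, 4)   # coarser receptive field: one 4x4 patch
```

A multi-receptive-field model sees both partitions, so detail lost at one patch size survives at another.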
Collapse
|
26
|
Lin H, Zhao M, Zhu L, Pei X, Wu H, Zhang L, Li Y. Gaussian filter facilitated deep learning-based architecture for accurate and efficient liver tumor segmentation for radiation therapy. Front Oncol 2024; 14:1423774. [PMID: 38966060 PMCID: PMC11222586 DOI: 10.3389/fonc.2024.1423774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2024] [Accepted: 06/06/2024] [Indexed: 07/06/2024] Open
Abstract
Purpose Addressing the challenges of unclear tumor boundaries and confusion between cysts and tumors in liver tumor segmentation, this study aims to develop an auto-segmentation method that applies a Gaussian filter to the nnU-Net architecture to effectively distinguish tumors from cysts, enhancing the accuracy of liver tumor auto-segmentation. Methods First, 130 cases from the Liver Tumor Segmentation Challenge 2017 (LiTS2017) were used for training and validating the nnU-Net-based auto-segmentation model. Then, 14 cases from the 3D-IRCADb dataset and 25 liver cancer cases retrospectively collected at our hospital were used for testing. The Dice similarity coefficient (DSC) was used to evaluate the accuracy of the auto-segmentation model by comparison with manual contours. Results The nnU-Net achieved an average DSC of 0.86 on the validation set (20 LiTS cases) and 0.82 on the public testing set (14 3D-IRCADb cases). On the clinical testing set, the standalone nnU-Net model achieved an average DSC of 0.75, which increased to 0.81 after post-processing with the Gaussian filter (P<0.05), demonstrating its effectiveness in mitigating the influence of liver cysts on liver tumor segmentation. Conclusion Experiments show that the Gaussian filter improves the accuracy of liver tumor segmentation in clinical practice.
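A hedged sketch of Gaussian-filter post-processing in the spirit described: smooth the model's probability map so that small speck-like responses (e.g., cyst-confusable false positives) fall below threshold while bulk tumor regions survive. The sigma, threshold, and toy data are illustrative assumptions, not the paper's settings:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    inter = np.logical_and(a, b).sum()
    return 2 * inter / (a.sum() + b.sum())

def postprocess(prob_map, sigma=1.0, thr=0.5):
    """Smooth the probability map, then re-threshold.
    An isolated high pixel is spread thin by the filter and drops below thr;
    a contiguous blob keeps most of its mass and stays above it."""
    return gaussian_filter(prob_map, sigma=sigma) > thr

prob = np.zeros((32, 32))
prob[10:20, 10:20] = 0.9      # tumor-like contiguous blob
prob[2, 2] = 1.0              # isolated speck (cyst-like false positive)
mask = postprocess(prob)

gt = np.zeros((32, 32), dtype=bool)
gt[10:20, 10:20] = True
d = dice(mask, gt)            # high: the blob survives, the speck does not
```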
Collapse
Affiliation(s)
- Hongyu Lin
- Department of Oncology, First Hospital of Hebei Medical University, Shijiazhuang, China
| | - Min Zhao
- Department of Oncology, First Hospital of Hebei Medical University, Shijiazhuang, China
| | - Lingling Zhu
- Department of Oncology, First Hospital of Hebei Medical University, Shijiazhuang, China
| | - Xi Pei
- Technology Development Department, Anhui Wisdom Technology Co., Ltd., Hefei, China
| | - Haotian Wu
- Technology Development Department, Anhui Wisdom Technology Co., Ltd., Hefei, China
| | - Lian Zhang
- Department of Oncology, First Hospital of Hebei Medical University, Shijiazhuang, China
| | - Ying Li
- Department of Oncology, First Hospital of Hebei Medical University, Shijiazhuang, China
| |
Collapse
|
27
|
Luo J, Dai P, He Z, Huang Z, Liao S, Liu K. Deep learning models for ischemic stroke lesion segmentation in medical images: A survey. Comput Biol Med 2024; 175:108509. [PMID: 38677171 DOI: 10.1016/j.compbiomed.2024.108509] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/02/2023] [Revised: 02/09/2024] [Accepted: 04/21/2024] [Indexed: 04/29/2024]
Abstract
This paper provides a comprehensive review of deep learning models for ischemic stroke lesion segmentation in medical images. Ischemic stroke is a severe neurological disease and a leading cause of death and disability worldwide. Accurate segmentation of stroke lesions in medical images such as MRI and CT scans is crucial for diagnosis, treatment planning and prognosis. This paper first introduces common imaging modalities used for stroke diagnosis, discussing their capabilities in imaging lesions at different disease stages from the acute to chronic stage. It then reviews three major public benchmark datasets for evaluating stroke segmentation algorithms: ATLAS, ISLES and AISD, highlighting their key characteristics. The paper proceeds to provide an overview of foundational deep learning architectures for medical image segmentation, including CNN-based and transformer-based models. It summarizes recent innovations in adapting these architectures to the task of stroke lesion segmentation across the three datasets, analyzing their motivations, modifications and results. A survey of loss functions and data augmentations employed for this task is also included. The paper discusses various aspects related to stroke segmentation tasks, including prior knowledge, small lesions, and multimodal fusion, and then concludes by outlining promising future research directions. Overall, this comprehensive review covers critical technical developments in the field to support continued progress in automated stroke lesion segmentation.
Collapse
Affiliation(s)
- Jialin Luo
- School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
| | - Peishan Dai
- School of Computer Science and Engineering, Central South University, Changsha, Hunan, China.
| | - Zhuang He
- School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
| | - Zhongchao Huang
- Department of Biomedical Engineering, School of Basic Medical Science, Central South University, Changsha, Hunan, China
| | - Shenghui Liao
- School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
| | - Kun Liu
- Brain Hospital of Hunan Province (The Second People's Hospital of Hunan Province), Changsha, Hunan, China
| |
Collapse
|
28
|
Zhou Z, Islam MT, Xing L. Multibranch CNN With MLP-Mixer-Based Feature Exploration for High-Performance Disease Diagnosis. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:7351-7362. [PMID: 37028335 PMCID: PMC11779602 DOI: 10.1109/tnnls.2023.3250490] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
Deep learning-based diagnosis is becoming an indispensable part of modern healthcare. For high-performance diagnosis, the optimal design of deep neural networks (DNNs) is a prerequisite. Despite their success in image analysis, existing supervised DNNs based on convolutional layers often suffer from rudimentary feature exploration ability caused by the limited receptive field and biased feature extraction of conventional convolutional neural networks (CNNs), which compromises network performance. Here, we propose a novel feature exploration network named manifold embedded multilayer perceptron (MLP) mixer (ME-Mixer), which utilizes both supervised and unsupervised features for disease diagnosis. In the proposed approach, a manifold embedding network is employed to extract class-discriminative features; then, two MLP-Mixer-based feature projectors are adopted to encode the extracted features with a global receptive field. Our ME-Mixer network is quite general and can be added as a plugin to any existing CNN. Comprehensive evaluations on two medical datasets were performed. The results demonstrate that our approach greatly enhances classification accuracy in comparison with different configurations of DNNs, with acceptable computational complexity.
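The MLP-Mixer idea behind the feature projectors alternates mixing across tokens (giving every token a global view) and across channels (per token). A minimal numpy sketch with no normalization or training, purely to show the two mixing directions; weights and shapes are arbitrary assumptions:

```python
import numpy as np

def mixer_block(x, w_tok, w_ch):
    """One simplified MLP-Mixer step: token mixing, then channel mixing.
    Real Mixer blocks add layer norm and two-layer MLPs; this keeps only
    the core alternating structure with residual connections."""
    x = x + np.tanh(w_tok @ x)   # token mixing: combine information across patches
    x = x + np.tanh(x @ w_ch)    # channel mixing: combine features within each patch
    return x

rng = np.random.default_rng(1)
tokens, channels = 4, 3
x = rng.standard_normal((tokens, channels))
out = mixer_block(x,
                  rng.standard_normal((tokens, tokens)) * 0.1,
                  rng.standard_normal((channels, channels)) * 0.1)
```

Because the token-mixing matrix spans all tokens, every output position depends on every input position, which is the "global receptive field" the abstract refers to.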
Collapse
|
29
|
Liu H, Zhou Y, Gou S, Luo Z. Tumor conspicuity enhancement-based segmentation model for liver tumor segmentation and RECIST diameter measurement in non-contrast CT images. Comput Biol Med 2024; 174:108420. [PMID: 38613896 DOI: 10.1016/j.compbiomed.2024.108420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2023] [Revised: 04/04/2024] [Accepted: 04/04/2024] [Indexed: 04/15/2024]
Abstract
BACKGROUND AND OBJECTIVE Liver tumor segmentation (LiTS) accuracy on contrast-enhanced computed tomography (CECT) images is higher than that on non-contrast computed tomography (NCCT) images. However, CECT requires contrast medium and repeated scans to obtain multiphase enhanced CT images, which is time-consuming and costly. Therefore, despite its lower accuracy, LiTS on NCCT images still plays an irreplaceable role in some clinical settings, such as guided brachytherapy, ablation, or the evaluation of patients with impaired renal function. In this study, we generate enhanced high-contrast pseudo-color CT (PCCT) images to improve the accuracy of LiTS and RECIST diameter measurement on NCCT images. METHODS To generate high-contrast CT liver tumor region images, an intensity-based tumor conspicuity enhancement (ITCE) model was first developed. In the ITCE model, a pseudo-color conversion function was established from the intensity distribution of the tumor and applied to NCCT to generate enhanced PCCT images. Additionally, we designed a tumor conspicuity enhancement-based liver tumor segmentation (TCELiTS) model to improve the segmentation of liver tumors on NCCT images. The TCELiTS model consists of three components: an image enhancement module based on the ITCE model, a segmentation module based on a deep convolutional neural network, and an attention loss module based on restricted activation. Segmentation performance was analyzed using the Dice similarity coefficient (DSC), sensitivity, specificity, and RECIST diameter error. RESULTS To develop the deep learning model, 100 patients with histopathologically confirmed liver tumors (hepatocellular carcinoma, 64 patients; hepatic hemangioma, 36 patients) were randomly divided into a training set (75 patients) and an independent test set (25 patients). 
Compared with existing automatic tumor segmentation networks trained on CECT images (U-Net, nnU-Net, DeepLab-V3, Modified U-Net), the DSCs achieved on the enhanced PCCT images all improved relative to those on NCCT images: from 0.696 to 0.713 for U-Net, from 0.715 to 0.776 for nnU-Net, from 0.748 to 0.788 for DeepLab-V3, and from 0.733 to 0.799 for Modified U-Net. In addition, an observer study involving 5 doctors compared segmentation on enhanced PCCT images with that on NCCT images and showed that enhanced PCCT images are more advantageous for doctors segmenting tumor regions: accuracy improved by approximately 3%-6%, while the time required to segment a single CT image was reduced by approximately 50%. CONCLUSIONS Experimental results show that the ITCE model can generate high-contrast enhanced PCCT images, especially in liver regions, and that the TCELiTS model can improve LiTS accuracy on NCCT images.
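One plausible reading of an intensity-based pseudo-color conversion is a mapping that boosts a color channel for intensities near the tumor's distribution, leaving other tissue as gray. The weighting function and toy values below are our illustrative assumptions, not the paper's ITCE definition:

```python
import numpy as np

def pseudo_color(ct, tumor_mean, tumor_std):
    """Illustrative conspicuity enhancement: intensities close to the tumor
    distribution (in z-score terms) are pushed toward pure red; everything
    else is left as grayscale. Assumes 8-bit intensity range."""
    z = (ct - tumor_mean) / tumor_std
    weight = np.exp(-0.5 * z ** 2)                     # 1 at tumor mean, ~0 far away
    rgb = np.stack([ct, ct, ct], axis=-1).astype(float)
    rgb[..., 0] = (1 - weight) * ct + weight * 255.0   # boost red where tumor-like
    return rgb.clip(0, 255).astype(np.uint8)

ct = np.full((4, 4), 60.0)     # background tissue intensity
ct[1:3, 1:3] = 100.0           # tumor-like intensities
img = pseudo_color(ct, tumor_mean=100.0, tumor_std=10.0)
```

Tumor-like pixels become bright red while the background stays gray, which is the kind of contrast boost a downstream segmentation network can exploit.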
Collapse
Affiliation(s)
- Haofeng Liu
- School of Artificial Intelligence, Xidian University, Xi'An, 710071, China
| | - Yanyan Zhou
- Department of Interventional Radiology, Tangdu Hospital, Airforce Medical University, Xi'an, 710038, China
| | - Shuiping Gou
- School of Artificial Intelligence, Xidian University, Xi'An, 710071, China
| | - Zhonghua Luo
- Department of Interventional Radiology, Tangdu Hospital, Airforce Medical University, Xi'an, 710038, China.
| |
Collapse
|
30
|
Liu H, Yang J, Jiang C, He S, Fu Y, Zhang S, Hu X, Fang J, Ji W. S2DA-Net: Spatial and spectral-learning double-branch aggregation network for liver tumor segmentation in CT images. Comput Biol Med 2024; 174:108400. [PMID: 38613888 DOI: 10.1016/j.compbiomed.2024.108400] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/26/2023] [Revised: 03/10/2024] [Accepted: 04/01/2024] [Indexed: 04/15/2024]
Abstract
Accurate liver tumor segmentation is crucial for aiding radiologists in hepatocellular carcinoma evaluation and surgical planning. While convolutional neural networks (CNNs) have been successful in medical image segmentation, they face challenges in capturing long-term dependencies among pixels. On the other hand, Transformer-based models demand a high number of parameters and involve significant computational costs. To address these issues, we propose the Spatial and Spectral-learning Double-branched Aggregation Network (S2DA-Net) for liver tumor segmentation. S2DA-Net consists of a double-branched encoder and a decoder with a Group Multi-Head Cross-Attention Aggregation (GMCA) module. The two encoder branches are a Fourier Spectral-learning Multi-scale Fusion (FSMF) branch and a Multi-axis Aggregation Hadamard Attention (MAHA) branch. The FSMF branch employs a Fourier-based network to learn amplitude and phase information, capturing richer features and detailed information without introducing an excessive number of parameters. The MAHA branch incorporates spatial information, enhancing discriminative features while minimizing computational costs. In the decoding path, the GMCA module extracts local information and establishes long-term dependencies, improving localization by amalgamating features from the two branches. Experimental results on the public LiTS2017 liver tumor dataset show that the proposed segmentation model achieves significant improvements over state-of-the-art methods, obtaining a dice per case (DPC) of 69.4% and a global dice (DG) of 80.0% for liver tumor segmentation. Meanwhile, the model pre-trained on LiTS2017 obtains a DPC of 73.4% and a DG of 82.2% on the 3DIRCADb dataset.
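The amplitude and phase information the FSMF branch learns from can be obtained with a 2D FFT; together the two spectra losslessly reconstruct the image. This is a direct-transform sketch, whereas the paper's branch is a learned network operating on these components:

```python
import numpy as np

def fourier_features(img):
    """Amplitude and phase spectra of a 2D image."""
    spec = np.fft.fft2(img)
    return np.abs(spec), np.angle(spec)

img = np.random.default_rng(0).random((8, 8))
amp, phase = fourier_features(img)

# Amplitude encodes energy per frequency; phase encodes structure/position.
# Recombining them recovers the original image exactly.
recon = np.fft.ifft2(amp * np.exp(1j * phase)).real
```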
Collapse
Affiliation(s)
- Huaxiang Liu
- Department Radiology of Taizhou Hospital, Zhejiang University, Taizhou, 318000, Zhejiang, China; Institute of Intelligent Information Processing, Taizhou University, Taizhou, 318000, Zhejiang, China; Key Laboratory of Evidence-based Radiology of Taizhou, Taizhou, 317000, Zhejiang, China
| | - Jie Yang
- School of Geophysics and Measurement and Control Technology, East China University of Technology, Nanchang, 330013, China
| | - Chao Jiang
- School of Geophysics and Measurement and Control Technology, East China University of Technology, Nanchang, 330013, China
| | - Sailing He
- Department Radiology of Taizhou Hospital, Zhejiang University, Taizhou, 318000, Zhejiang, China
| | - Youyao Fu
- Institute of Intelligent Information Processing, Taizhou University, Taizhou, 318000, Zhejiang, China
| | - Shiqing Zhang
- Institute of Intelligent Information Processing, Taizhou University, Taizhou, 318000, Zhejiang, China
| | - Xudong Hu
- Key Laboratory of Evidence-based Radiology of Taizhou, Taizhou, 317000, Zhejiang, China
| | - Jiangxiong Fang
- Institute of Intelligent Information Processing, Taizhou University, Taizhou, 318000, Zhejiang, China.
| | - Wenbin Ji
- Department Radiology of Taizhou Hospital, Zhejiang University, Taizhou, 318000, Zhejiang, China; Key Laboratory of Evidence-based Radiology of Taizhou, Taizhou, 317000, Zhejiang, China.
| |
Collapse
|
31
|
Wang KN, Li SX, Bu Z, Zhao FX, Zhou GQ, Zhou SJ, Chen Y. SBCNet: Scale and Boundary Context Attention Dual-Branch Network for Liver Tumor Segmentation. IEEE J Biomed Health Inform 2024; 28:2854-2865. [PMID: 38427554 DOI: 10.1109/jbhi.2024.3370864] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/03/2024]
Abstract
Automated segmentation of liver tumors in CT scans is pivotal for diagnosing and treating liver cancer, offering a valuable alternative to labor-intensive manual processes and ensuring accurate and reliable clinical assessment. However, the inherent variability of liver tumors, coupled with the challenges posed by blurred boundaries in imaging, presents a substantial obstacle to their precise segmentation. In this paper, we propose a novel dual-branch liver tumor segmentation model, SBCNet, to address these challenges effectively. Specifically, our proposed method introduces a contextual encoding module, which enables better identification of tumor variability using an advanced multi-scale adaptive kernel. Moreover, a boundary enhancement module is designed for the counterpart branch to enhance the perception of boundaries by incorporating contour learning with the Sobel operator. Finally, we propose a hybrid multi-task loss function, concurrently addressing tumor scale and boundary features, to foster interaction across the tasks of the two branches, further improving tumor segmentation. Experimental validation on the publicly available LiTS dataset demonstrates the practical efficacy of each module, with SBCNet yielding competitive results compared to other state-of-the-art methods for liver tumor segmentation.
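Contour learning with the Sobel operator typically supervises a boundary map derived from the segmentation mask: the gradient of a binary mask is nonzero only at its edges. A sketch of that derivation (helper name ours):

```python
import numpy as np
from scipy.ndimage import sobel

def boundary_map(mask):
    """Binary contour target from a segmentation mask via Sobel gradients.
    Gradient magnitude is zero inside and outside the object, nonzero at
    its boundary."""
    m = mask.astype(float)
    grad = np.hypot(sobel(m, axis=0), sobel(m, axis=1))
    return grad > 0

mask = np.zeros((16, 16), dtype=np.uint8)
mask[4:12, 4:12] = 1           # a square "tumor"
edges = boundary_map(mask)     # a thin ring around the square
```

Training a branch against such a target sharpens the network's sensitivity to the blurred boundaries the abstract describes.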
Collapse
|
32
|
Zhang W, Tao Y, Huang Z, Li Y, Chen Y, Song T, Ma X, Zhang Y. Multi-phase features interaction transformer network for liver tumor segmentation and microvascular invasion assessment in contrast-enhanced CT. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2024; 21:5735-5761. [PMID: 38872556 DOI: 10.3934/mbe.2024253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2024]
Abstract
Precise segmentation of liver tumors from computed tomography (CT) scans is a prerequisite step in various clinical applications. Multi-phase CT imaging enhances tumor characterization, thereby assisting radiologists in accurate identification. However, existing automatic liver tumor segmentation models did not fully exploit multi-phase information and lacked the capability to capture global information. In this study, we developed a pioneering multi-phase feature interaction Transformer network (MI-TransSeg) for accurate liver tumor segmentation and subsequent microvascular invasion (MVI) assessment in contrast-enhanced CT images. In the proposed network, an efficient multi-phase feature interaction module was introduced to enable bi-directional feature interaction among multiple phases, thus maximally exploiting the available multi-phase information. To enhance the model's capability to extract global information, a hierarchical Transformer-based encoder and decoder architecture was designed. Importantly, we devised a multi-resolution scales feature aggregation strategy (MSFA) to optimize the parameters and performance of the proposed model. Subsequent to segmentation, the liver tumor masks generated by MI-TransSeg were applied to extract radiomic features for the clinical application of MVI assessment. With Institutional Review Board (IRB) approval, a clinical multi-phase contrast-enhanced abdominal CT dataset was collected that included 164 patients with liver tumors. The experimental results demonstrated that the proposed MI-TransSeg was superior to various state-of-the-art methods. Additionally, we found that the tumor mask predicted by our method showed promising potential in the assessment of microvascular invasion. In conclusion, MI-TransSeg presents an innovative paradigm for the segmentation of complex liver tumors, thus underscoring the significance of multi-phase CT data exploitation. 
The proposed MI-TransSeg network has the potential to assist radiologists in diagnosing liver tumors and assessing microvascular invasion.
Collapse
Affiliation(s)
- Wencong Zhang
- Department of Biomedical Engineering, College of Engineering, Shantou University, Shantou, China
- Department of Biomedical Engineering, College of Design and Engineering, National University of Singapore, Singapore
| | - Yuxi Tao
- Department of Radiology, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
| | - Zhanyao Huang
- Department of Biomedical Engineering, College of Engineering, Shantou University, Shantou, China
| | - Yue Li
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
| | - Yingjia Chen
- Department of Biomedical Engineering, College of Engineering, Shantou University, Shantou, China
| | - Tengfei Song
- Department of Radiology, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
| | - Xiangyuan Ma
- Department of Biomedical Engineering, College of Engineering, Shantou University, Shantou, China
| | - Yaqin Zhang
- Department of Biomedical Engineering, College of Engineering, Shantou University, Shantou, China
| |
Collapse
|
33
|
Zhang K, Yang X, Cui Y, Zhao J, Li D. Imaging segmentation mechanism for rectal tumors using improved U-Net. BMC Med Imaging 2024; 24:95. [PMID: 38654162 DOI: 10.1186/s12880-024-01269-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/19/2024] [Accepted: 04/05/2024] [Indexed: 04/25/2024] Open
Abstract
OBJECTIVE In radiation therapy, cancerous region segmentation in magnetic resonance images (MRI) is a critical step. For rectal cancer, the automatic segmentation of rectal tumors from an MRI is a great challenge. There are two main shortcomings in existing deep learning-based methods that lead to incorrect segmentation: 1) there are many organs surrounding the rectum, and the shape of some organs is similar to that of rectal tumors; 2) high-level features extracted by conventional neural networks often do not contain enough high-resolution information. Therefore, an improved U-Net segmentation network based on attention mechanisms is proposed to replace the traditional U-Net network. METHODS The overall framework of the proposed method is based on the traditional U-Net. A ResNeSt module was added to extract the overall features, and a shape module was added after the encoder layer. We then combined the outputs of the shape module and the decoder to obtain the results. Moreover, the model used different types of attention mechanisms, so that the network learns complementary information to improve segmentation accuracy. RESULTS We validated the effectiveness of the proposed method using 3773 2D MRI images from 304 patients. The results showed that the proposed method achieved 0.987, 0.946, 0.897, and 0.899 for Dice, MPA, MIoU, and FWIoU, respectively; these values are significantly better than those of other existing methods. CONCLUSION Due to time savings, the proposed method can help radiologists segment rectal tumors effectively and enable them to focus on patients whose cancerous regions are difficult for the network to segment. SIGNIFICANCE The proposed method can help doctors segment rectal tumors, thereby ensuring good diagnostic quality and accuracy.
Collapse
Affiliation(s)
- Kenan Zhang
- College of Electronic Information and Optical Engineering, Taiyuan University of Technology, Taiyuan, 030024, China
- Key Laboratory of Big Data Fusion Analysis and Application of Shanxi Province, Taiyuan, 030024, China
| | - Xiaotang Yang
- Department of Radiology, Shanxi Cancer Hospital, Shanxi Medical University, Taiyuan, 030013, China.
| | - Yanfen Cui
- Department of Radiology, Shanxi Cancer Hospital, Shanxi Medical University, Taiyuan, 030013, China
| | - Jumin Zhao
- College of Electronic Information and Optical Engineering, Taiyuan University of Technology, Taiyuan, 030024, China
- Key Laboratory of Big Data Fusion Analysis and Application of Shanxi Province, Taiyuan, 030024, China
- Intelligent Perception Engineering Technology Center of Shanxi, Taiyuan, 030024, China
| | - Dengao Li
- College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, 030024, China.
- Key Laboratory of Big Data Fusion Analysis and Application of Shanxi Province, Taiyuan, 030024, China.
- Intelligent Perception Engineering Technology Center of Shanxi, Taiyuan, 030024, China.
| |
Collapse
|
34
|
Yang X, Zheng Y, Mei C, Jiang G, Tian B, Wang L. UGLS: an uncertainty guided deep learning strategy for accurate image segmentation. Front Physiol 2024; 15:1362386. [PMID: 38651048 PMCID: PMC11033460 DOI: 10.3389/fphys.2024.1362386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2024] [Accepted: 03/26/2024] [Indexed: 04/25/2024] Open
Abstract
Accurate image segmentation plays a crucial role in computer vision and medical image analysis. In this study, we developed a novel uncertainty-guided deep learning strategy (UGLS) to enhance the performance of an existing neural network (i.e., U-Net) in segmenting multiple objects of interest from images of varying modalities. In the developed UGLS, a boundary uncertainty map is introduced for each object based on its coarse segmentation (obtained by the U-Net) and then combined with the input images for fine segmentation of the objects. We validated the developed method by segmenting optic cup (OC) regions from color fundus images and left and right lung regions from X-ray images. Experiments on public fundus and X-ray image datasets showed that the developed method achieved an average Dice Score (DS) of 0.8791 and a sensitivity (SEN) of 0.8858 for OC segmentation, and corresponding scores of 0.9605, 0.9607, 0.9621, and 0.9668 for left and right lung segmentation. Our method significantly improved the segmentation performance of the U-Net, making it comparable or superior to five sophisticated networks (i.e., AU-Net, BiO-Net, AS-Net, Swin-Unet, and TransUNet).
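A common formulation of a boundary uncertainty map from a coarse probability map peaks where the network is least decided (p ≈ 0.5), which is typically along object boundaries; the exact definition used by UGLS may differ:

```python
import numpy as np

def boundary_uncertainty(prob):
    """Uncertainty from a coarse probability map: 1 at p = 0.5, 0 at p = 0 or 1.
    This is one common choice (equivalent to 1 - |2p - 1|), not necessarily
    the paper's formulation."""
    return 1.0 - np.abs(2.0 * prob - 1.0)

# Confident foreground, undecided boundary pixel, confident background.
prob = np.array([[0.95, 0.5, 0.05]])
u = boundary_uncertainty(prob)
```

Concatenating such a map with the input image tells the fine-segmentation stage where to spend its capacity.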
Collapse
Affiliation(s)
- Xiaoguo Yang
- Wenzhou People’s Hospital, The Third Affiliated Hospital of Shanghai University, Wenzhou, China
| | - Yanyan Zheng
- Wenzhou People’s Hospital, The Third Affiliated Hospital of Shanghai University, Wenzhou, China
| | - Chenyang Mei
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, China
| | - Gaoqiang Jiang
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, China
| | - Bihan Tian
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, China
| | - Lei Wang
- School of Ophthalmology and Optometry, Eye Hospital, Wenzhou Medical University, Wenzhou, China
| |
Collapse
|
35
|
Zhan F, Wang W, Chen Q, Guo Y, He L, Wang L. Three-Direction Fusion for Accurate Volumetric Liver and Tumor Segmentation. IEEE J Biomed Health Inform 2024; 28:2175-2186. [PMID: 38109246 DOI: 10.1109/jbhi.2023.3344392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2023]
Abstract
Biomedical image segmentation of organs, tissues and lesions has gained increasing attention in clinical treatment planning and navigation, which involves the exploration of two-dimensional (2D) and three-dimensional (3D) contexts in biomedical images. Compared to 2D methods, 3D methods pay more attention to inter-slice correlations, which offer additional spatial information for image segmentation. An organ or tumor has a 3D structure that can be observed from three directions. Previous studies focus only on the vertical axis, limiting understanding of the relationship between a tumor and its surrounding tissues; important information can also be obtained from the sagittal and coronal axes. Therefore, spatial information about organs and tumors can be obtained from three directions, i.e., the sagittal, coronal and vertical axes, to better understand the invasion depth of a tumor and its relationship with the surrounding tissues. Moreover, the edges of organs and tumors in biomedical images may be blurred. To address these problems, we propose a three-direction fusion volumetric segmentation (TFVS) model for segmenting 3D biomedical images from three perspectives in the sagittal, coronal and transverse planes, respectively. We use the dataset of the liver task provided by the Medical Segmentation Decathlon challenge to train our model. The TFVS method demonstrates competitive performance on the 3D-IRCADB dataset. In addition, the t-test and Wilcoxon signed-rank test are performed to show the statistical significance of the improvement by the proposed method as compared with the baseline methods. The proposed method is expected to be beneficial in guiding and facilitating clinical diagnosis and treatment.
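The three-direction idea can be sketched as running a 2D predictor slice-wise along each of the three axes and averaging the resulting probability volumes. The paper's fusion is learned; this is a minimal illustration with a toy thresholding "predictor":

```python
import numpy as np

def fuse_three_directions(vol, predict_2d):
    """Apply a 2D per-slice predictor along the sagittal, coronal, and
    vertical axes, reassemble each into a volume, and average the three."""
    preds = []
    for axis in range(3):
        slices = [predict_2d(np.take(vol, i, axis=axis))
                  for i in range(vol.shape[axis])]
        preds.append(np.stack(slices, axis=axis))  # rebuild volume along that axis
    return np.mean(preds, axis=0)

vol = np.zeros((4, 4, 4))
vol[1:3, 1:3, 1:3] = 1.0       # toy binary "tumor" volume
fused = fuse_three_directions(vol, predict_2d=lambda s: (s > 0.5).astype(float))
```

With a consistent predictor the three views agree; in practice each view sees different inter-slice context, and averaging (or a learned fusion) reconciles them.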
36
Lin Y, Wang J, Liu Q, Zhang K, Liu M, Wang Y. CFANet: Context fusing attentional network for preoperative CT image segmentation in robotic surgery. Comput Biol Med 2024; 171:108115. [PMID: 38402837] [DOI: 10.1016/j.compbiomed.2024.108115]
Abstract
Accurate segmentation of CT images is crucial for clinical diagnosis and the preoperative evaluation of robotic surgery, but fuzzy boundaries and small targets make it challenging. In response, a novel 2D segmentation network named Context Fusing Attentional Network (CFANet) is proposed. CFANet incorporates three key modules to address these challenges: a pyramid fusing module (PFM), a parallel dilated convolution module (PDCM) and a scale attention module (SAM). Integrating these modules into the encoder-decoder structure enables effective use of multi-level and multi-scale features. Compared with an advanced segmentation method, the Dice score improved by 2.14% on a liver tumor dataset. This improvement is expected to benefit the preoperative evaluation of robotic surgery and support clinical diagnosis, especially early tumor detection.
Affiliation(s)
- Yao Lin, Jiazheng Wang, Qinghao Liu, Kang Zhang, Min Liu and Yaonan Wang: College of Electrical and Information Engineering and National Engineering Research Center of Robot Visual Perception and Control Technology, Hunan University, Changsha 410082, China
- Min Liu: also Research Institute of Hunan University in Chongqing, Chongqing 401135, China
37
Ling Y, Wang Y, Liu Q, Yu J, Xu L, Zhang X, Liang P, Kong D. EPolar-UNet: An edge-attending polar UNet for automatic medical image segmentation with small datasets. Med Phys 2024; 51:1702-1713. [PMID: 38299370] [DOI: 10.1002/mp.16957]
Abstract
BACKGROUND Medical image segmentation is a key step in computer-aided clinical diagnosis, geometric characterization, measurement, image registration, and so forth. Convolutional neural networks, especially UNet and its variants, have been used successfully in many medical image segmentation tasks. However, the results are limited by a deficiency in extracting high-resolution edge information, owing to the design of the skip connections in UNet, and by the need for large datasets. PURPOSE In this paper, we proposed an edge-attending polar UNet (EPolar-UNet), which is trained in the polar coordinate system instead of the classic Cartesian coordinate system, with an edge-attending construction in the skip-connection path. METHODS EPolar-UNet extracts the location information from an eight-stacked hourglass network as the pole for the polar transformation and extracts boundary cues from an edge-attending UNet, which consists of a deconvolution layer and a subtraction operation. RESULTS We evaluated the performance of EPolar-UNet across three imaging modalities and segmentation tasks: the CVC-ClinicDB dataset for polyps, the ISIC-2018 dataset for skin lesions, and our private ultrasound dataset for liver tumor segmentation. Our proposed model outperformed state-of-the-art models on all three datasets and needed only 30%-60% of the training data required by the benchmark UNet model to achieve similar performance. CONCLUSIONS We proposed an end-to-end EPolar-UNet for automatic medical image segmentation and showed good performance on small datasets, which is critical in the field of medical image segmentation.
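Polar-coordinate training relies on a Cartesian-to-polar resampling around a predicted pole (obtained in the paper from the stacked-hourglass network). A sketch of that transformation, with illustrative grid sizes and nearest-neighbour lookup (bilinear interpolation would be smoother):

```python
import numpy as np

def to_polar(img, pole, n_r=64, n_theta=64):
    """Resample a 2D image onto an (r, theta) grid centred at `pole`,
    using nearest-neighbour lookup."""
    h, w = img.shape
    r_max = np.hypot(max(pole[0], h - 1 - pole[0]),
                     max(pole[1], w - 1 - pole[1]))
    rs = np.linspace(0.0, r_max, n_r)
    thetas = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    rr, tt = np.meshgrid(rs, thetas, indexing="ij")
    ys = np.clip(np.rint(pole[0] + rr * np.sin(tt)).astype(int), 0, h - 1)
    xs = np.clip(np.rint(pole[1] + rr * np.cos(tt)).astype(int), 0, w - 1)
    return img[ys, xs]
```

In polar space a roughly star-convex lesion boundary becomes a single-valued curve r(θ), which is part of what makes edge attention easier to learn.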
Affiliation(s)
- Yating Ling, Qian Liu, Xiaoqian Zhang and Dexing Kong: School of Mathematical Sciences, Zhejiang University, Hangzhou, China
- Yuling Wang, Jie Yu and Ping Liang: Department of Interventional Ultrasound, The Fifth Medical Center of Chinese PLA General Hospital, Beijing, China
- Lei Xu: Zhejiang Qiushi Institute for Mathematical Medicine, Hangzhou, China
38
Shi J, Wang Z, Ruan S, Zhao M, Zhu Z, Kan H, An H, Xue X, Yan B. Rethinking automatic segmentation of gross target volume from a decoupling perspective. Comput Med Imaging Graph 2024; 112:102323. [PMID: 38171254] [DOI: 10.1016/j.compmedimag.2023.102323]
Abstract
Accurate and reliable segmentation of the Gross Target Volume (GTV) is critical in Radiation Therapy (RT) planning for cancer, but manual delineation is time-consuming and subject to inter-observer variation. Recently, deep learning methods have achieved remarkable success in medical image segmentation. However, owing to the low image contrast and extreme pixel imbalance between the GTV and adjacent tissues, most existing methods achieve only limited performance on automatic GTV segmentation. In this paper, we propose a Heterogeneous Cascade Framework (HCF) from a decoupling perspective, which decomposes GTV segmentation into independent recognition and segmentation subtasks: the former screens out the abnormal slices containing GTV, while the latter performs pixel-wise segmentation of these slices. With this decoupled two-stage framework, we can efficiently filter normal slices to reduce false positives. To further improve segmentation performance, we design a multi-level Spatial Alignment Network (SANet) based on the feature pyramid structure, which introduces a spatial alignment module into the decoder to compensate for the information loss caused by downsampling. Moreover, we propose a Combined Regularization (CR) loss and a Balance-Sampling Strategy (BSS) to alleviate the pixel imbalance problem and improve network convergence. Extensive experiments on two public datasets of the StructSeg2019 challenge demonstrate that our method outperforms state-of-the-art methods, with significant advantages in reducing false positives and accurately segmenting small objects. The code is available at https://github.com/shijun18/GTV_AutoSeg.
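The decoupling itself can be sketched in a few lines, assuming already-trained recognition and segmentation models; `recognize` and `segment` below are placeholders for the paper's two sub-networks:

```python
import numpy as np

def decoupled_segment(volume, recognize, segment):
    """Two-stage GTV segmentation: a recognition model first screens each
    slice for tumor, and only flagged slices are passed to the pixel-wise
    segmenter, so normal slices cannot contribute false positives."""
    out = np.zeros(volume.shape, dtype=np.uint8)
    for i, sl in enumerate(volume):
        if recognize(sl):          # recognition subtask (slice-level)
            out[i] = segment(sl)   # segmentation subtask (pixel-level)
    return out
```

Filtering at the slice level is what gives the cascade its false-positive advantage: the segmenter never sees slices the recognizer rejects.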
Affiliation(s)
- Jun Shi, Zhaohui Wang, Shulan Ruan, Minfan Zhao, Ziqi Zhu and Hongyu Kan: School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China
- Hong An: School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China; Laoshan Laboratory, Qingdao 266221, China
- Xudong Xue: Hubei Cancer Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430074, China
- Bing Yan: Department of Radiation Oncology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230001, China
39
Liu Z, Hou J, Pan X, Zhang R, Shi Z. PA-Net: A phase attention network fusing venous and arterial phase features of CT images for liver tumor segmentation. Comput Methods Programs Biomed 2024; 244:107997. [PMID: 38176329] [DOI: 10.1016/j.cmpb.2023.107997]
Abstract
BACKGROUND AND OBJECTIVE Liver cancer seriously threatens human health. In clinical diagnosis, contrast-enhanced computed tomography (CECT) images provide important supplementary information for accurate liver tumor segmentation. However, most existing methods for automatic liver tumor segmentation focus only on single-phase image features, and existing multi-modal methods suffer limited segmentation performance due to redundant fusion features. In addition, the spatial misalignment of multi-phase images causes feature interference. METHODS In this paper, we propose a phase attention network (PA-Net) to adequately aggregate multi-phase information of CT images and improve segmentation performance for liver tumors. Specifically, we design a PA module that generates attention weight maps voxel by voxel to efficiently fuse multi-phase CT image features and avoid feature redundancy. To solve the problem of feature interference in the multi-phase segmentation task, we design a new learning strategy and demonstrate its effectiveness experimentally. RESULTS We conduct comparative experiments on an in-house clinical dataset and achieve state-of-the-art segmentation performance among multi-phase methods. In addition, our method improves the mean Dice score by 3.3% compared with the single-phase method based on nnUNet, and our learning strategy improves the mean Dice score by 1.51% compared with the ML strategy. CONCLUSION The experimental results show that our method is superior to existing multi-phase liver tumor segmentation methods and provides a scheme for dealing with missing modalities in multi-modal tasks. In addition, our proposed learning strategy makes more effective use of arterial phase image information and proves most effective for liver tumor segmentation on thick-slice CT images. The source code is released at https://github.com/Houjunfeng203934/PA-Net.
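The core of voxel-wise phase attention can be illustrated as a learned gate that blends venous and arterial features; the scalar weights below are illustrative stand-ins for the module's convolutional parameters, not PA-Net's actual design:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def phase_attention_fuse(feat_v, feat_a, w_v, w_a, bias=0.0):
    """Voxel-wise phase attention sketch: a gate in (0, 1) is computed from
    the venous and arterial feature maps, then used to blend them, so each
    voxel decides which phase it trusts more."""
    gate = sigmoid(w_v * feat_v + w_a * feat_a + bias)
    return gate * feat_v + (1.0 - gate) * feat_a
```

Because the gate is a convex combination, the fused map never duplicates both phases at full strength, which is one way to avoid the feature redundancy the abstract describes.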
Affiliation(s)
- Zhenbing Liu, Junfeng Hou and Xipeng Pan: School of Computer and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
- Ruojie Zhang: The Second Affiliated Hospital of Guangxi Medical University, Nanning 530007, China
- Zhenwei Shi: Department of Radiology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou 510080, China; Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangzhou 510080, China
40
Seo H, Lee S, Yun S, Leem S, So S, Han DH. RenseNet: A Deep Learning Network Incorporating Residual and Dense Blocks with Edge Conservative Module to Improve Small-Lesion Classification and Model Interpretation. Cancers (Basel) 2024; 16:570. [PMID: 38339320] [PMCID: PMC10854971] [DOI: 10.3390/cancers16030570]
Abstract
Deep learning has become an essential tool in medical image analysis owing to its remarkable performance. Target classification and model interpretability are key applications of deep learning in medical image analysis, and many deep learning-based algorithms have emerged for them. Most include pooling operations, a type of subsampling used to enlarge the receptive field; however, in signal-processing terms, pooling degrades image detail, to which small objects in an image are especially sensitive. Therefore, in this study, we designed a Rense block and an edge conservative module to effectively exploit previous feature information in the feed-forward learning process. Specifically, the Rense block, an optimal design incorporating the skip connections of both residual and dense blocks, was derived through mathematical analysis. Furthermore, we avoid blurring of the features in the pooling operation through a compensation path in the edge conservative module. Two independent CT datasets of kidney stones and lung tumors, in which small lesions are common, were used to verify the proposed RenseNet. The classification results and explanation heatmaps show that RenseNet provides the best inference and interpretation compared with current state-of-the-art methods. The proposed RenseNet can contribute significantly to efficient diagnosis and treatment because it is effective for small lesions that might otherwise be misclassified or misinterpreted.
Affiliation(s)
- Hyunseok Seo, Seokjun Lee, Sojin Yun, Saebom Leem and Seohee So: Bionics Research Center, Biomedical Research Division, Korea Institute of Science and Technology (KIST), Seoul 02792, Republic of Korea
- Deok Hyun Han: Department of Urology, Samsung Medical Center (SMC), Seoul 06351, Republic of Korea
41
Liu L, Wu K, Wang K, Han Z, Qiu J, Zhan Q, Wu T, Xu J, Zeng Z. SEU2-Net: multi-scale U2-Net with SE attention mechanism for liver occupying lesion CT image segmentation. PeerJ Comput Sci 2024; 10:e1751. [PMID: 38435550] [PMCID: PMC10909188] [DOI: 10.7717/peerj-cs.1751]
Abstract
Liver occupying lesions can profoundly impact an individual's health and well-being. To assist physicians in diagnosing and treating abnormal areas in the liver, we propose a novel network named SEU2-Net, which introduces a channel attention mechanism into U2-Net for accurate and automatic liver occupying lesion segmentation. We design the Residual U-block with Squeeze-and-Excitation (SE-RSU), which adds the Squeeze-and-Excitation (SE) attention mechanism at the residual connections of the Residual U-blocks (RSU, the component units of U2-Net). SEU2-Net not only retains the advantages of U2-Net in capturing contextual information at multiple scales, but also adaptively recalibrates channel feature responses to emphasize useful feature information according to the channel attention mechanism. In addition, we present a new abdominal CT dataset for liver occupying lesion segmentation drawn from Peking University First Hospital's clinical data (the PUFH dataset). We evaluate the proposed method against eight deep learning networks on the PUFH and Liver Tumor Segmentation Challenge (LiTS) datasets. The experimental results show that SEU2-Net achieves state-of-the-art performance and good robustness in liver occupying lesion segmentation.
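The recalibration that SE-RSU inserts at the residual connections is the standard Squeeze-and-Excitation operation. A plain numpy forward pass, where `w1` and `w2` are the bottleneck weights (shapes illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(x, w1, w2):
    """Squeeze-and-Excitation on a (C, H, W) feature map: global-average-pool
    each channel (squeeze), pass it through a two-layer bottleneck
    (excitation), and rescale the channels by the resulting gates in (0, 1).
    w1: (C//r, C), w2: (C, C//r) for reduction ratio r."""
    z = x.mean(axis=(1, 2))                    # squeeze -> (C,)
    s = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))  # excitation gates -> (C,)
    return x * s[:, None, None]                # channel-wise recalibration
```

The gates depend only on channel-wise statistics, so the block adds very few parameters while letting the network emphasize informative channels.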
Affiliation(s)
- Lizhuang Liu, Kun Wu and Zhenqi Han: Shanghai Advanced Research Institute, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai, China
- Ke Wang and Jianxing Qiu: Radiology Department, Peking University First Hospital, Beijing, China
- Qiao Zhan: Department of Infectious Diseases, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Tian Wu, Jinghang Xu and Zheng Zeng: Department of Infectious Diseases, Peking University First Hospital, Beijing, China
42
Yang S, Liang Y, Wu S, Sun P, Chen Z. SADSNet: A robust 3D synchronous segmentation network for liver and liver tumors based on spatial attention mechanism and deep supervision. J Xray Sci Technol 2024; 32:707-723. [PMID: 38552134] [DOI: 10.3233/xst-230312]
Abstract
Highlights
- Introduce a data augmentation strategy to expand the morphologically diverse data required during training, improving the algorithm's ability to learn features from CT images of complex and varied tumor morphology.
- Design attention mechanisms for the encoding and decoding paths to extract fine pixel-level features, improve feature extraction, and achieve efficient spatial-channel feature fusion.
- Use a deep supervision layer to correct and decode the final image data for highly accurate results.
- Validate the effectiveness of the method on the LiTS, 3DIRCADb, and SLIVER datasets.
BACKGROUND Accurately extracting the liver and liver tumors from medical images is an important step in lesion localization and diagnosis, surgical planning, and postoperative monitoring. However, the limited number of radiation therapists and the great number of images make this work time-consuming. OBJECTIVE This study designs a spatial attention deep supervised network (SADSNet) for simultaneous automatic segmentation of the liver and tumors. METHOD First, self-designed spatial attention modules are introduced at each layer of the encoder and decoder to extract image features at different scales and resolutions, helping the model better capture liver tumors and fine structures. The designed spatial attention module is implemented through two gate signals related to the liver and tumors and by varying the convolutional kernel sizes. Second, deep supervision is added behind three layers of the decoder to assist the backbone network in feature learning and improve gradient propagation, enhancing robustness. RESULTS The method was tested on the LiTS, 3DIRCADb, and SLIVER datasets. For the liver, it obtained Dice similarity coefficients of 97.03%, 96.11%, and 97.40%, surface Dice of 81.98%, 82.53%, and 86.29%, 95% Hausdorff distances of 8.96 mm, 8.26 mm, and 3.79 mm, and average surface distances of 1.54 mm, 1.19 mm, and 0.81 mm. It also achieved precise tumor segmentation, with Dice scores of 87.81% and 87.50%, surface Dice of 89.63% and 84.26%, 95% Hausdorff distances of 12.96 mm and 16.55 mm, and average surface distances of 1.11 mm and 3.04 mm on LiTS and 3DIRCADb, respectively. CONCLUSION The experimental results show that the proposed method is effective and superior to several other methods, and can therefore provide technical support for liver and liver tumor segmentation in clinical practice.
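The Dice scores quoted throughout these abstracts follow the standard definition; for reference, a minimal implementation:

```python
import numpy as np

def dice_percent(pred, gt, eps=1e-7):
    """Dice similarity coefficient between two binary masks, in percent:
    100 * 2|A ∩ B| / (|A| + |B|)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 100.0 * 2.0 * inter / (pred.sum() + gt.sum() + eps)
```

Surface Dice and Hausdorff distance additionally measure boundary agreement, which is why papers report them alongside volumetric Dice.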
Affiliation(s)
- Sijing Yang, Yongbo Liang and Shang Wu: School of Life and Environmental Science, Guilin University of Electronic Technology, Guilin, China
- Peng Sun: School of Electronic Engineering and Automation, Guilin University of Electronic Technology, Guilin, China
- Zhencheng Chen: School of Life and Environmental Science and School of Electronic Engineering and Automation, Guilin University of Electronic Technology, Guilin, China; Guangxi Colleges and Universities Key Laboratory of Biomedical Sensors and Intelligent Instruments, Guilin, China; Guangxi Engineering Technology Research Center of Human Physiological Information Noninvasive Detection, Guilin, China
43
Xie Y, Zhang J, Xia Y, Shen C. Learning From Partially Labeled Data for Multi-Organ and Tumor Segmentation. IEEE Trans Pattern Anal Mach Intell 2023; 45:14905-14919. [PMID: 37672381] [DOI: 10.1109/tpami.2023.3312587]
Abstract
Medical image benchmarks for the segmentation of organs and tumors suffer from partial labeling because full annotation demands intensive labor and expertise. Current mainstream approaches follow the practice of one network solving one task; with this pipeline, not only is performance limited by the typically small dataset of a single task, but the computation cost also increases linearly with the number of tasks. To address this, we propose a Transformer-based dynamic on-demand network (TransDoDNet) that learns to segment organs and tumors on multiple partially labeled datasets. Specifically, TransDoDNet has a hybrid backbone composed of a convolutional neural network and a Transformer. A dynamic head enables the network to accomplish multiple segmentation tasks flexibly: unlike existing approaches that fix kernels after training, the kernels in the dynamic head are generated adaptively by the Transformer, which employs self-attention to model long-range organ-wise dependencies and decodes an organ embedding that represents each organ. We create a large-scale partially labeled Multi-Organ and Tumor Segmentation benchmark, termed MOTS, and demonstrate the superior performance of TransDoDNet over other competitors on seven organ and tumor segmentation tasks. This study also provides a general 3D medical image segmentation model, pre-trained on the large-scale MOTS benchmark, that outperforms current predominant self-supervised learning methods.
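A highly simplified sketch of the dynamic-head mechanism: a controller maps a task/organ embedding to the kernel of a 1x1 segmentation head, so one backbone can serve many partially labeled tasks. In the paper the kernels come from the Transformer decoder; the linear controller and shapes here are illustrative:

```python
import numpy as np

def controller(task_emb, w, c_out, c_in):
    """Generate a flattened 1x1 conv kernel from a task embedding via a
    linear map, then reshape it to (c_out, c_in). w: (c_out*c_in, d)."""
    return (w @ task_emb).reshape(c_out, c_in)

def dynamic_head(feat, kernel):
    """Apply the generated kernel as a 1x1 convolution over channels:
    feat (C_in, H, W) -> (C_out, H, W)."""
    return np.einsum("oc,chw->ohw", kernel, feat)
```

Because the kernel is regenerated per task at inference time, adding a new task changes only the embedding, not the backbone.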
44
Ikuta M, Zhang J. A Deep Convolutional Gated Recurrent Unit for CT Image Reconstruction. IEEE Trans Neural Netw Learn Syst 2023; 34:10612-10625. [PMID: 35522637] [DOI: 10.1109/tnnls.2022.3169569]
Abstract
Computed tomography (CT) is one of the most important medical imaging technologies in use today. Most commercial CT products use a technique known as filtered backprojection (FBP), which is fast and can produce decent image quality when the X-ray dose is high. However, FBP is not good enough for low-dose CT imaging, where the reconstruction problem becomes more stochastic. A more effective technique proposed recently, and implemented in a limited number of commercial CT products, is iterative reconstruction (IR). The IR technique is based on a Bayesian formulation of the CT image reconstruction problem, with an explicit model of the CT scanning process, including its stochastic nature, and a prior model that incorporates knowledge about what a good CT image should look like. However, constructing such prior knowledge is more complicated than it seems. In this article, we propose a novel neural network for CT image reconstruction. The network is based on the IR formulation and constructed with a recurrent neural network (RNN): specifically, we transform the gated recurrent unit (GRU) into a neural network performing CT image reconstruction, which we call "GRU reconstruction." This neural network conducts concurrent dual-domain learning. Many deep learning (DL)-based methods in medical imaging use single-domain learning, but dual-domain learning performs better because it learns from both the sinogram and the image domain. In addition, we propose backpropagation through stage (BPTS) as a new RNN backpropagation algorithm; it resembles backpropagation through time (BPTT) but is tailored for iterative optimization. Results from extensive experiments indicate that our proposed method outperforms conventional model-based methods, single-domain DL methods, and state-of-the-art DL techniques in terms of root mean squared error (RMSE), peak signal-to-noise ratio (PSNR), and structural similarity (SSIM), as well as visual appearance.
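The recurrent unit being repurposed is the standard GRU cell; in "GRU reconstruction" each recurrent step acts as one iteration of the reconstruction, with the hidden state playing the role of the evolving image estimate. The plain cell update in numpy, with illustrative weight shapes:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h, x, Wz, Uz, Wr, Ur, Wh, Uh):
    """One standard GRU update: update gate z, reset gate r, candidate state
    h_tilde, then a convex combination of the old and candidate states."""
    z = sigmoid(Wz @ x + Uz @ h)             # how much to update
    r = sigmoid(Wr @ x + Ur @ h)             # how much past state to expose
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h)) # candidate state
    return (1.0 - z) * h + z * h_tilde
```

In the dual-domain setting, the input `x` would carry data-consistency information derived from the sinogram while `h` evolves in the image domain; that pairing is the paper's contribution, not shown here.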
45
Chen Z, Hua S, Gao J, Chen Y, Gong Y, Shen Y, Tang X, Emu Y, Jin W, Hu C. A dual-stage partially interpretable neural network for joint suppression of bSSFP banding and flow artifacts in non-phase-cycled cine imaging. J Cardiovasc Magn Reson 2023; 25:68. [PMID: 37993824] [PMCID: PMC10666342] [DOI: 10.1186/s12968-023-00988-z]
Abstract
PURPOSE To develop a partially interpretable neural network for joint suppression of banding and flow artifacts in non-phase-cycled bSSFP cine imaging. METHODS A dual-stage neural network consisting of a voxel-identification (VI) sub-network and an artifact-suppression (AS) sub-network is proposed. The VI sub-network identifies artifacts, which guides artifact suppression and improves interpretability; the AS sub-network reduces banding and flow artifacts. Short-axis cine images at 12 frequency offsets from 28 healthy subjects were used to train and test the dual-stage network, and an additional 77 patients were retrospectively enrolled to evaluate its clinical generalizability. For healthy subjects, artifact suppression performance was analyzed by comparison with traditional phase cycling, and the partial interpretability provided by the VI sub-network was analyzed via correlation analysis. Generalizability was evaluated for cine obtained with different sequence parameters and scanners. For patients, artifact suppression performance and partial interpretability of the network were qualitatively evaluated by 3 clinicians, and cardiac function before and after artifact suppression was assessed via left ventricular ejection fraction (LVEF). RESULTS For the healthy subjects, visual inspection and quantitative analysis found a considerable reduction of banding and flow artifacts by the proposed network. Compared with traditional phase cycling, the proposed network improved flow artifact scores (4.57 ± 0.23 vs 3.40 ± 0.38, P = 0.002) and overall image quality (4.33 ± 0.22 vs 3.60 ± 0.38, P = 0.002). The VI sub-network accurately identified the locations of banding and flow artifacts in the original cine and correlated significantly with the change of signal intensities in these regions. Changes in imaging parameters or scanner did not cause a significant change of overall image quality relative to the baseline dataset, suggesting good generalizability. For the patients, qualitative analysis showed a significant improvement of banding artifacts (4.01 ± 0.50 vs 2.77 ± 0.40, P < 0.001), flow artifacts (4.22 ± 0.38 vs 2.97 ± 0.57, P < 0.001), and image quality (3.91 ± 0.45 vs 2.60 ± 0.43, P < 0.001) relative to the original cine. The artifact suppression slightly reduced the LVEF (mean bias = -1.25%, P = 0.01). CONCLUSIONS The dual-stage network simultaneously reduces banding and flow artifacts in bSSFP cine imaging with partial interpretability, sparing the need for sequence modification. The method can be easily deployed in a clinical setting to identify artifacts and improve cine image quality.
Affiliation(s)
- Zhuo Chen: National Engineering Research Center of Advanced Magnetic Resonance Technologies for Diagnosis and Therapy, School of Biomedical Engineering, Shanghai Jiao Tong University, 415 S Med-X Center, 1954 Huashan Road, Shanghai, 200030, China
- Sha Hua: Department of Cardiovascular Medicine, Heart Failure Center, Ruijin Hospital Lu Wan Branch, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Juan Gao: National Engineering Research Center of Advanced Magnetic Resonance Technologies for Diagnosis and Therapy, School of Biomedical Engineering, Shanghai Jiao Tong University, 415 S Med-X Center, 1954 Huashan Road, Shanghai, 200030, China
- Yanjia Chen: Department of Cardiovascular Medicine, Heart Failure Center, Ruijin Hospital Lu Wan Branch, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Yiwen Gong: Department of Cardiovascular Medicine, Heart Failure Center, Ruijin Hospital Lu Wan Branch, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Yiwen Shen: Department of Cardiovascular Medicine, Heart Failure Center, Ruijin Hospital Lu Wan Branch, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Xin Tang: National Engineering Research Center of Advanced Magnetic Resonance Technologies for Diagnosis and Therapy, School of Biomedical Engineering, Shanghai Jiao Tong University, 415 S Med-X Center, 1954 Huashan Road, Shanghai, 200030, China
- Yixin Emu: National Engineering Research Center of Advanced Magnetic Resonance Technologies for Diagnosis and Therapy, School of Biomedical Engineering, Shanghai Jiao Tong University, 415 S Med-X Center, 1954 Huashan Road, Shanghai, 200030, China
- Wei Jin: Department of Cardiovascular Medicine, Heart Failure Center, Ruijin Hospital Lu Wan Branch, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Chenxi Hu: National Engineering Research Center of Advanced Magnetic Resonance Technologies for Diagnosis and Therapy, School of Biomedical Engineering, Shanghai Jiao Tong University, 415 S Med-X Center, 1954 Huashan Road, Shanghai, 200030, China
46
Chen Y, Yu L, Wang JY, Panjwani N, Obeid JP, Liu W, Liu L, Kovalchuk N, Gensheimer MF, Vitzthum LK, Beadle BM, Chang DT, Le QT, Han B, Xing L. Adaptive Region-Specific Loss for Improved Medical Image Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023; 45:13408-13421. [PMID: 37363838 PMCID: PMC11346301 DOI: 10.1109/tpami.2023.3289667] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/28/2023]
Abstract
Defining the loss function is an important part of neural network design and critically determines the success of deep learning modeling. A significant shortcoming of conventional loss functions is that they weight all regions in the input image volume equally, despite the fact that the system is known to be heterogeneous (i.e., some regions can achieve high prediction performance more easily than others). Here, we introduce a region-specific loss to lift the implicit assumption of homogeneous weighting for better learning. We divide the entire volume into multiple sub-regions, each with an individualized loss constructed for optimal local performance. Effectively, this scheme imposes higher weightings on the sub-regions that are more difficult to segment, and vice versa. Furthermore, the regional false positive and false negative errors are computed for each input image during a training step, and the regional penalty is adjusted accordingly to enhance the overall accuracy of the prediction. Using different public and in-house medical image datasets, we demonstrate that the proposed regionally adaptive loss paradigm outperforms conventional methods in multi-organ segmentation, without any modification to the neural network architecture or additional data preparation.
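The regional weighting scheme described above can be sketched in a few lines. This is an assumed formulation for illustration, not the authors' exact loss: the image is split into a grid of sub-regions, a soft Dice loss is computed per region, and regions with more false positives/negatives receive proportionally higher weight.

```python
import numpy as np

def region_adaptive_loss(pred, target, grid=(2, 2)):
    """Illustrative regionally adaptive loss (assumed form).

    pred, target: 2D arrays of per-pixel foreground probabilities /
    labels. Each grid cell gets its own Dice loss, weighted by its
    regional false-positive + false-negative burden, so harder
    regions dominate the total loss.
    """
    eps = 1e-6
    h_step = pred.shape[0] // grid[0]
    w_step = pred.shape[1] // grid[1]
    losses, weights = [], []
    for i in range(grid[0]):
        for j in range(grid[1]):
            p = pred[i * h_step:(i + 1) * h_step, j * w_step:(j + 1) * w_step]
            t = target[i * h_step:(i + 1) * h_step, j * w_step:(j + 1) * w_step]
            dice = (2 * (p * t).sum() + eps) / (p.sum() + t.sum() + eps)
            fp = (p * (1 - t)).sum()          # regional false positives
            fn = ((1 - p) * t).sum()          # regional false negatives
            losses.append(1.0 - dice)
            weights.append(1.0 + fp + fn)     # harder regions weigh more
    w = np.array(weights) / np.sum(weights)
    return float(np.sum(w * np.array(losses)))
```

In a real training loop the weights would be recomputed per image per step, as the abstract describes, so the penalty tracks where the network currently errs.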
47
Li C, Bagher-Ebadian H, Sultan RI, Elshaikh M, Movsas B, Zhu D, Chetty IJ. A new architecture combining convolutional and transformer-based networks for automatic 3D multi-organ segmentation on CT images. Med Phys 2023; 50:6990-7002. [PMID: 37738468 DOI: 10.1002/mp.16750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/09/2023] [Revised: 08/08/2023] [Accepted: 08/13/2023] [Indexed: 09/24/2023] Open
Abstract
PURPOSE Deep learning-based networks have become increasingly popular in the field of medical image segmentation. The purpose of this research was to develop and optimize a new architecture for automatic segmentation of the prostate gland and normal organs in the pelvic, thoracic, and upper gastro-intestinal (GI) regions. METHODS We developed an architecture which combines a shifted-window (Swin) transformer with a convolutional U-Net. The network includes a parallel encoder, a cross-fusion block, and a CNN-based decoder to extract local and global information and merge related features on the same scale. A skip connection is applied between the cross-fusion block and decoder to integrate low-level semantic features. Attention gates (AGs) are integrated within the CNN to suppress features in image background regions. Our network is termed "SwinAttUNet." We optimized the architecture for automatic image segmentation. Training datasets consisted of planning-CT datasets from 300 prostate cancer patients from an institutional database and 100 CT datasets from a publicly available dataset (CT-ORG). Images were linearly interpolated and resampled to a spatial resolution of (1.0 × 1.0 × 1.5) mm³. A volume patch (192 × 192 × 96) was used for training and inference, and the dataset was split into training (75%), validation (10%), and test (15%) cohorts. Data augmentation transforms were applied, consisting of random flip, rotation, and intensity scaling. The loss function comprised Dice and cross-entropy terms, equally weighted and summed. We evaluated Dice coefficients (DSC), 95th percentile Hausdorff distances (HD95), and average surface distances (ASD) between the results of our network and ground truth data. RESULTS For SwinAttUNet, DSC values were 86.54 ± 1.21, 94.15 ± 1.17, and 87.15 ± 1.68% and HD95 values were 5.06 ± 1.42, 3.16 ± 0.93, and 5.54 ± 1.63 mm for the prostate, bladder, and rectum, respectively. Respective ASD values were 1.45 ± 0.57, 0.82 ± 0.12, and 1.42 ± 0.38 mm.
For the lung, liver, kidneys and pelvic bones, respective DSC values were: 97.90 ± 0.80, 96.16 ± 0.76, 93.74 ± 2.25, and 89.31 ± 3.87%. Respective HD95 values were: 5.13 ± 4.11, 2.73 ± 1.19, 2.29 ± 1.47, and 5.31 ± 1.25 mm. Respective ASD values were: 1.88 ± 1.45, 1.78 ± 1.21, 0.71 ± 0.43, and 1.21 ± 1.11 mm. Our network outperformed several existing deep learning approaches using only attention-based convolutional or Transformer-based feature strategies, as detailed in the results section. CONCLUSIONS We have demonstrated that our new architecture combining Transformer- and convolution-based features is able to better learn the local and global context for automatic segmentation of multi-organ, CT-based anatomy.
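The training loss described above (Dice and cross-entropy, equally weighted and summed) can be written out explicitly for the binary case; the paper's multi-organ setting would sum this over classes. The function below is an illustrative sketch, not the authors' code.

```python
import numpy as np

def dice_ce_loss(prob, target):
    """Equally weighted sum of soft Dice loss and binary cross-entropy,
    the combination described for SwinAttUNet training (binary case
    shown here for brevity)."""
    eps = 1e-6
    prob = np.clip(prob, eps, 1 - eps)  # guard the logarithms
    # Soft Dice: overlap between predicted probabilities and labels.
    dice = (2 * (prob * target).sum() + eps) / (prob.sum() + target.sum() + eps)
    dice_loss = 1.0 - dice
    # Binary cross-entropy averaged over all voxels.
    ce = -np.mean(target * np.log(prob) + (1 - target) * np.log(1 - prob))
    return dice_loss + ce
```

The Dice term rewards global overlap while the cross-entropy term penalizes every voxel individually, which is a common reason for combining the two in segmentation training.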
Affiliation(s)
- Chengyin Li: Department of Computer Science, College of Engineering, Wayne State University, Detroit, Michigan, USA
- Hassan Bagher-Ebadian: Department of Radiation Oncology, Henry Ford Cancer Institute, Detroit, Michigan, USA; Department of Radiology, Michigan State University, East Lansing, Michigan, USA; Department of Osteopathic Medicine, Michigan State University, East Lansing, Michigan, USA; Department of Physics, Oakland University, Rochester, Michigan, USA
- Rafi Ibn Sultan: Department of Computer Science, College of Engineering, Wayne State University, Detroit, Michigan, USA
- Mohamed Elshaikh: Department of Radiation Oncology, Henry Ford Cancer Institute, Detroit, Michigan, USA
- Benjamin Movsas: Department of Radiation Oncology, Henry Ford Cancer Institute, Detroit, Michigan, USA
- Dongxiao Zhu: Department of Computer Science, College of Engineering, Wayne State University, Detroit, Michigan, USA
- Indrin J Chetty: Department of Radiation Oncology, Henry Ford Cancer Institute, Detroit, Michigan, USA; Department of Radiation Oncology, Cedars Sinai Medical Center, Los Angeles, CA, USA
48
Tian M, Wang H, Liu X, Ye Y, Ouyang G, Shen Y, Li Z, Wang X, Wu S. Delineation of clinical target volume and organs at risk in cervical cancer radiotherapy by deep learning networks. Med Phys 2023; 50:6354-6365. [PMID: 37246619 DOI: 10.1002/mp.16468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2022] [Revised: 04/17/2023] [Accepted: 04/28/2023] [Indexed: 05/30/2023] Open
Abstract
PURPOSE Delineation of the clinical target volume (CTV) and organs-at-risk (OARs) is important in cervical cancer radiotherapy, but it is generally labor-intensive, time-consuming, and subjective. This paper proposes a parallel-path attention fusion network (PPAF-net) to overcome these disadvantages in the delineation task. METHODS The PPAF-net utilizes both the texture and structure information of the CTV and OARs by employing a U-Net network to capture the high-level texture information and an up-sampling and down-sampling (USDS) network to capture the low-level structure information to accentuate the boundaries of the CTV and OARs. Multi-level features extracted from both networks are then fused together through an attention module to generate the delineation result. RESULTS The dataset contains 276 computed tomography (CT) scans of patients with stage IB-IIA cervical cancer, provided by the West China Hospital of Sichuan University. Simulation results demonstrate that PPAF-net performs favorably on the delineation of the CTV and OARs (e.g., rectum and bladder) and achieves state-of-the-art delineation accuracy. In terms of the Dice similarity coefficient (DSC) and the Hausdorff distance (HD), it achieves 88.61% and 2.25 cm for the CTV, 92.27% and 0.73 cm for the rectum, 96.74% and 0.68 cm for the bladder, 96.38% and 0.65 cm for the left kidney, 96.79% and 0.63 cm for the right kidney, 93.42% and 0.52 cm for the left femoral head, 93.69% and 0.51 cm for the right femoral head, 87.53% and 1.07 cm for the small intestine, and 91.50% and 0.84 cm for the spinal cord. CONCLUSIONS The proposed automatic delineation network PPAF-net performs well on CTV and OAR segmentation tasks, which has great potential for reducing the burden on radiation oncologists and increasing the accuracy of delineation.
In the future, radiation oncologists from the West China Hospital of Sichuan University will further evaluate the network's delineation results, helping to make this method useful in clinical practice.
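The DSC and HD figures quoted above are standard overlap and boundary metrics, and can be computed for small binary masks as follows. This brute-force sketch is illustrative only: production volumes need accelerated nearest-neighbor queries, and many papers report HD95 (the 95th percentile of surface distances) rather than the maximum shown here.

```python
import numpy as np

def dsc(a, b):
    """Dice similarity coefficient between two binary masks."""
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

def hausdorff(a, b, spacing=1.0):
    """Symmetric Hausdorff distance between binary masks.

    Brute-force pairwise distances; fine for small arrays only.
    `spacing` converts voxel indices to physical units (e.g., mm).
    """
    pa = np.argwhere(a) * spacing
    pb = np.argwhere(b) * spacing
    d = np.sqrt(((pa[:, None, :] - pb[None, :, :]) ** 2).sum(-1))
    # Farthest nearest-neighbor distance, taken in both directions.
    return max(d.min(axis=1).max(), d.min(axis=0).max())
```

DSC is insensitive to a few badly misplaced boundary voxels, while HD is dominated by them, which is why delineation papers such as this one report both.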
Affiliation(s)
- Miao Tian: School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Hongqiu Wang: School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Xingang Liu: School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Yuyun Ye: Department of Electrical and Computer Engineering, University of Tulsa, Tulsa, USA
- Ganlu Ouyang: Department of Radiation Oncology, Cancer Center, the West China Hospital of Sichuan University, Chengdu, China
- Yali Shen: Department of Radiation Oncology, Cancer Center, the West China Hospital of Sichuan University, Chengdu, China
- Zhiping Li: Department of Radiation Oncology, Cancer Center, the West China Hospital of Sichuan University, Chengdu, China
- Xin Wang: Department of Radiation Oncology, Cancer Center, the West China Hospital of Sichuan University, Chengdu, China
- Shaozhi Wu: School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China; Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
49
Radiya K, Joakimsen HL, Mikalsen KØ, Aahlin EK, Lindsetmo RO, Mortensen KE. Performance and clinical applicability of machine learning in liver computed tomography imaging: a systematic review. Eur Radiol 2023; 33:6689-6717. [PMID: 37171491 PMCID: PMC10511359 DOI: 10.1007/s00330-023-09609-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2022] [Revised: 02/02/2023] [Accepted: 02/06/2023] [Indexed: 05/13/2023]
Abstract
OBJECTIVES Machine learning (ML) for medical imaging is emerging for several organs and image modalities. Our objectives were to provide clinicians with an overview of this field by answering the following questions: (1) How is ML applied in liver computed tomography (CT) imaging? (2) How well do ML systems perform in liver CT imaging? (3) What are the clinical applications of ML in liver CT imaging? METHODS A systematic review was carried out according to the guidelines from the PRISMA-P statement. The search string focused on studies containing content relating to artificial intelligence, liver, and computed tomography. RESULTS One hundred ninety-one studies were included. In the majority of studies, ML was applied to CT liver imaging through image analysis without clinician intervention, while newer studies combined ML methods with clinical input. Several models were documented to perform very accurately on reliable but small datasets. Most models identified were deep learning-based, mainly using convolutional neural networks. Many potential clinical applications of ML to CT liver imaging were identified through our review, including segmentation and classification of the liver and its lesions, segmentation of vascular structures inside the liver, fibrosis and cirrhosis staging, metastasis prediction, and evaluation of chemotherapy. CONCLUSION Several studies attempted to provide transparent model results. To make these models suitable for clinical application, prospective clinical validation studies are urgently needed; computer scientists and engineers should cooperate with health professionals to ensure this. KEY POINTS • ML shows great potential for CT liver image tasks such as pixel-wise segmentation and classification of liver and liver lesions, fibrosis staging, metastasis prediction, and retrieval of relevant liver lesions from similar cases of other patients.
• Although the presentation of results is not standardized, many studies have attempted to provide transparent results so that the performance of ML methods can be interpreted. • Prospective studies for clinical validation of ML methods are urgently needed, preferably carried out in cooperation between clinicians and computer scientists.
Affiliation(s)
- Keyur Radiya: Department of Gastroenterological Surgery at University Hospital of North Norway (UNN), Tromso, Norway; Department of Clinical Medicine, UiT The Arctic University of Norway, Tromso, Norway
- Henrik Lykke Joakimsen: Institute of Clinical Medicine, UiT The Arctic University of Norway, Tromso, Norway; Centre for Clinical Artificial Intelligence (SPKI), University Hospital of North Norway, Tromso, Norway
- Karl Øyvind Mikalsen: Department of Clinical Medicine, UiT The Arctic University of Norway, Tromso, Norway; Centre for Clinical Artificial Intelligence (SPKI), University Hospital of North Norway, Tromso, Norway; UiT Machine Learning Group, Department of Physics and Technology, UiT The Arctic University of Norway, Tromso, Norway
- Eirik Kjus Aahlin: Department of Gastroenterological Surgery at University Hospital of North Norway (UNN), Tromso, Norway
- Rolv-Ole Lindsetmo: Department of Clinical Medicine, UiT The Arctic University of Norway, Tromso, Norway; Head Clinic of Surgery, Oncology and Women Health, University Hospital of North Norway, Tromso, Norway
- Kim Erlend Mortensen: Department of Gastroenterological Surgery at University Hospital of North Norway (UNN), Tromso, Norway; Department of Clinical Medicine, UiT The Arctic University of Norway, Tromso, Norway
50
Chen Y, Gensheimer MF, Bagshaw HP, Butler S, Yu L, Zhou Y, Shen L, Kovalchuk N, Surucu M, Chang DT, Xing L, Han B. Patient-Specific Auto-segmentation on Daily kVCT Images for Adaptive Radiation Therapy. Int J Radiat Oncol Biol Phys 2023; 117:505-514. [PMID: 37141982 DOI: 10.1016/j.ijrobp.2023.04.026] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/10/2022] [Revised: 04/18/2023] [Accepted: 04/25/2023] [Indexed: 05/06/2023]
Abstract
PURPOSE This study explored deep-learning-based patient-specific auto-segmentation using transfer learning on daily RefleXion kilovoltage computed tomography (kVCT) images to facilitate adaptive radiation therapy, based on data from the first group of patients treated with the innovative RefleXion system. METHODS AND MATERIALS For head and neck (HaN) and pelvic cancers, a deep convolutional segmentation network was initially trained on a population dataset that contained 67 and 56 patient cases, respectively. The pretrained population network was then adapted to each specific RefleXion patient by fine-tuning the network weights with a transfer learning method. For each of the 6 collected RefleXion HaN cases and 4 pelvic cases, initial planning computed tomography (CT) scans and 5 to 26 sets of daily kVCT images were used separately for patient-specific learning and evaluation. The performance of the patient-specific network was compared with the population network and the clinical rigid registration method and evaluated by the Dice similarity coefficient (DSC), with manual contours as the reference. The corresponding dosimetric effects resulting from different auto-segmentation and registration methods were also investigated. RESULTS The proposed patient-specific network achieved mean DSC results of 0.88 for 3 HaN organs at risk (OARs) of interest and 0.90 for 8 pelvic targets and OARs, outperforming the population network (0.70 and 0.63) and the registration method (0.72 and 0.72). The DSC of the patient-specific network gradually increased with the number of longitudinal training cases and approached saturation with more than 6 training cases. Compared with using the registration contour, the target and OAR mean doses and dose-volume histograms obtained using the patient-specific auto-segmentation were closer to the results using the manual contour.
CONCLUSIONS Auto-segmentation of RefleXion kVCT images based on the patient-specific transfer learning could achieve higher accuracy, outperforming a common population network and clinical registration-based method. This approach shows promise in improving dose evaluation accuracy in RefleXion adaptive radiation therapy.
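The transfer-learning recipe above (pretrain on a population cohort, then fine-tune the weights on a specific patient's scans) can be miniaturized to a logistic model for illustration. Everything below is a stand-in: the paper fine-tunes a deep convolutional segmentation network, not a logistic regression, and the data, weights, and learning rate here are invented.

```python
import numpy as np

def fine_tune(w, X, y, lr=0.5, steps=200):
    """Fine-tune logistic-regression weights on a small
    patient-specific set, starting from population-pretrained
    weights `w`, via gradient descent on binary cross-entropy.
    A toy analogue of patient-specific transfer learning.
    """
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))      # sigmoid predictions
        w = w - lr * X.T @ (p - y) / len(y)   # BCE gradient step
    return w
```

The key point mirrored here is that training starts from the population weights rather than from scratch, so only a handful of patient-specific samples (daily kVCT fractions, in the paper) are needed to adapt the model.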
Affiliation(s)
- Yizheng Chen: Department of Radiation Oncology, Stanford University, Stanford, California
- Hilary P Bagshaw: Department of Radiation Oncology, Stanford University, Stanford, California
- Santino Butler: Department of Radiation Oncology, Stanford University, Stanford, California
- Lequan Yu: Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong, China
- Yuyin Zhou: Department of Computer Science and Engineering, University of California Santa Cruz, Santa Cruz, California
- Liyue Shen: Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts
- Nataliya Kovalchuk: Department of Radiation Oncology, Stanford University, Stanford, California
- Murat Surucu: Department of Radiation Oncology, Stanford University, Stanford, California
- Daniel T Chang: Department of Radiation Oncology, University of Michigan, Ann Arbor, Michigan
- Lei Xing: Department of Radiation Oncology, Stanford University, Stanford, California
- Bin Han: Department of Radiation Oncology, Stanford University, Stanford, California