1. Murugesan GK, McCrumb D, Aboian M, Verma T, Soni R, Memon F, Farahani K, Pei L, Wagner U, Fedorov AY, Clunie D, Moore S, Van Oss J. AI-Generated Annotations Dataset for Diverse Cancer Radiology Collections in NCI Image Data Commons. Sci Data 2024; 11:1165. PMID: 39443503; PMCID: PMC11500357; DOI: 10.1038/s41597-024-03977-8.
Abstract
The National Cancer Institute (NCI) Image Data Commons (IDC) offers publicly available cancer radiology collections for cloud computing, which are crucial for developing advanced imaging tools and algorithms. Despite their potential, these collections are minimally annotated: only 4% of the DICOM studies in the collections considered in this project had existing segmentation annotations. This project increases the quantity of segmentations in various IDC collections. We produced a high-quality dataset of AI-generated imaging annotations of tissues, organs, and/or cancers for 11 distinct IDC image collections. These collections contain images from a variety of modalities, including computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET), and cover various body parts, such as the chest, breast, kidneys, prostate, and liver. A portion of the AI annotations was reviewed and corrected by a radiologist to assess the performance of the AI models. Both the AI and radiologist annotations were encoded in conformance with the Digital Imaging and Communications in Medicine (DICOM) standard, allowing seamless integration into the IDC collections as third-party analysis collections. All models, images, and annotations are publicly accessible.
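Because both annotation sets are encoded as DICOM Segmentation (SEG) objects, they can be inspected with standard DICOM tooling. A minimal sketch using pydicom follows; the file name is hypothetical, and a real SEG object would first be downloaded from the IDC portal:

```python
# A minimal sketch of inspecting a DICOM SEG annotation with pydicom.
# "ai_annotation.seg.dcm" is a hypothetical local file path.
import pydicom

seg = pydicom.dcmread("ai_annotation.seg.dcm")
print(seg.Modality)            # "SEG" for a DICOM Segmentation object
print(seg.SeriesDescription)

# Each segment (e.g., an organ or tumor) is described in SegmentSequence.
for segment in seg.SegmentSequence:
    print(segment.SegmentNumber, segment.SegmentLabel)

# PixelData holds the stacked binary segment frames; pixel_array decodes them.
frames = seg.pixel_array       # shape: (n_frames, rows, cols)
print(frames.shape, frames.dtype)
```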
Affiliation(s)
- Tej Verma: Yale School of Medicine, New Haven, CT, USA
- Linmin Pei: Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Ulrike Wagner: Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Andrey Y Fedorov: Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
2. Lei W, Xu W, Li K, Zhang X, Zhang S. MedLSAM: Localize and segment anything model for 3D CT images. Med Image Anal 2024; 99:103370. PMID: 39447436; DOI: 10.1016/j.media.2024.103370.
Abstract
Recent advancements in foundation models have shown significant potential in medical image analysis. However, there is still a gap in models specifically designed for medical image localization. To address this, we introduce MedLAM, a 3D medical foundation localization model that accurately identifies any anatomical part within the body using only a few template scans. MedLAM employs two self-supervision tasks, unified anatomical mapping (UAM) and multi-scale similarity (MSS), trained across a comprehensive dataset of 14,012 CT scans. Furthermore, we developed MedLSAM by integrating MedLAM with the Segment Anything Model (SAM). This framework requires only extreme point annotations along three directions on several template scans: MedLAM then locates the target anatomical structure in the image, and SAM performs the segmentation. This significantly reduces the amount of manual annotation SAM requires in 3D medical imaging scenarios. We conducted extensive experiments on two 3D datasets covering 38 distinct organs. Our findings are twofold: (1) MedLAM can directly localize anatomical structures using just a few template scans, achieving performance comparable to fully supervised models; (2) MedLSAM closely matches the performance of SAM and its specialized medical adaptations with manual prompts, while minimizing the need for extensive point annotations across the entire dataset. Moreover, MedLAM has the potential to be seamlessly integrated with future 3D SAM models, paving the way for enhanced segmentation performance. Our code is publicly available at https://github.com/openmedlab/MedLSAM.
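To illustrate how extreme points become a localization prompt, the sketch below derives a 3D bounding box from six hypothetical extreme points (two per axis) and forms the per-slice box prompt that SAM expects. The coordinates are made up; MedLAM itself infers such points for unseen scans:

```python
import numpy as np

# Six hypothetical extreme points of an organ in (z, y, x) voxel coordinates,
# one pair per axis, as might be marked on a template scan.
extreme_points = np.array([
    [12, 40, 55], [34, 40, 55],   # inferior / superior
    [23, 22, 55], [23, 61, 55],   # anterior / posterior
    [23, 40, 30], [23, 40, 78],   # left / right
])

# The tight 3D bounding box spanned by the extreme points.
zmin, ymin, xmin = extreme_points.min(axis=0)
zmax, ymax, xmax = extreme_points.max(axis=0)
print("3D box:", (zmin, zmax), (ymin, ymax), (xmin, xmax))

# A 2D box prompt (x0, y0, x1, y1) for each axial slice inside the box;
# a SAM predictor would consume this as its box prompt.
for z in range(zmin, zmax + 1):
    sam_box_prompt = (xmin, ymin, xmax, ymax)
```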
Affiliation(s)
- Wenhui Lei: School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China; Shanghai AI Lab, Shanghai, China
- Wei Xu: School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China; West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China
- Kang Li: Shanghai AI Lab, Shanghai, China; West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China
- Xiaofan Zhang: School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China; Shanghai AI Lab, Shanghai, China
- Shaoting Zhang: School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China; Shanghai AI Lab, Shanghai, China
3. Li W, Qu C, Chen X, Bassi PRAS, Shi Y, Lai Y, Yu Q, Xue H, Chen Y, Lin X, Tang Y, Cao Y, Han H, Zhang Z, Liu J, Zhang T, Ma Y, Wang J, Zhang G, Yuille A, Zhou Z. AbdomenAtlas: A large-scale, detailed-annotated, & multi-center dataset for efficient transfer learning and open algorithmic benchmarking. Med Image Anal 2024; 97:103285. PMID: 39116766; DOI: 10.1016/j.media.2024.103285.
Abstract
We introduce AbdomenAtlas, the largest abdominal CT dataset to date: 20,460 three-dimensional CT volumes sourced from 112 hospitals across diverse populations, geographies, and facilities. AbdomenAtlas provides 673K high-quality masks of anatomical structures in the abdominal region, annotated by a team of 10 radiologists with the help of AI algorithms. We started by having expert radiologists manually annotate 22 anatomical structures in 5,246 CT volumes. A semi-automatic annotation procedure was then performed for the remaining CT volumes: radiologists revise the annotations predicted by AI, and in turn, AI improves its predictions by learning from the revised annotations. Such a large-scale, detailed-annotated, multi-center dataset is needed for two reasons. First, AbdomenAtlas provides important resources for AI development at scale, such as large pre-trained models, which can alleviate the annotation workload of expert radiologists when transferring to broader clinical applications. Second, AbdomenAtlas establishes a large-scale benchmark for evaluating AI algorithms: the more data we use to test algorithms, the better we can guarantee reliable performance in complex clinical scenarios. An ISBI & MICCAI challenge named "BodyMaps: Towards 3D Atlas of Human Body" was launched using a subset of AbdomenAtlas, aiming to stimulate AI innovation and to benchmark segmentation accuracy, inference efficiency, and domain generalizability. We hope AbdomenAtlas can set the stage for larger-scale clinical trials and offer exceptional opportunities to practitioners in the medical imaging community. Codes, models, and datasets are available at https://www.zongweiz.com/dataset.
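The semi-automatic procedure can be pictured as a loop in which the AI proposes masks, radiologists revise them, and the AI retrains on the revisions. The toy sketch below, with a thresholding "model" and a simulated reviser standing in for the real components, is an illustrative assumption, not the authors' pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

def model_predict(volume, threshold):
    """Stand-in AI model: thresholds intensities to propose a mask."""
    return (volume > threshold).astype(np.uint8)

def radiologist_revise(pred, truth, effort=0.5):
    """Stand-in reviser: corrects a random fraction of the AI's errors."""
    errors = pred != truth
    fix = errors & (rng.random(pred.shape) < effort)
    revised = pred.copy()
    revised[fix] = truth[fix]
    return revised

def dice(a, b):
    return 2 * np.logical_and(a, b).sum() / (a.sum() + b.sum() + 1e-8)

# Toy "CT volume": a noisy cube whose ground truth is known.
truth = np.zeros((32, 32, 32), np.uint8)
truth[8:24, 8:24, 8:24] = 1
volume = truth + 0.4 * rng.standard_normal(truth.shape)

threshold = 0.9  # deliberately poor initial model
for it in range(3):
    pred = model_predict(volume, threshold)
    revised = radiologist_revise(pred, truth)
    # "Retraining": pick the threshold that best matches the revised masks.
    candidates = np.linspace(0.1, 1.0, 19)
    threshold = max(candidates, key=lambda t: dice(model_predict(volume, t), revised))
    print(f"iteration {it}: Dice vs ground truth = {dice(pred, truth):.3f}")
```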
Affiliation(s)
- Wenxuan Li: Department of Computer Science, Johns Hopkins University, United States of America
- Chongyu Qu: Department of Computer Science, Johns Hopkins University, United States of America
- Xiaoxi Chen: Department of Bioengineering, University of Illinois Urbana-Champaign, United States of America
- Pedro R A S Bassi: Department of Computer Science, Johns Hopkins University, United States of America; Alma Mater Studiorum - University of Bologna, Italy; Center for Biomolecular Nanotechnologies, Istituto Italiano di Tecnologia, Italy
- Yijia Shi: LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Yuxiang Lai: Department of Computer Science, Johns Hopkins University, United States of America; Department of Computer Science, Southeast University, China
- Qian Yu: Department of Radiology, Southeast University Zhongda Hospital, China
- Huimin Xue: Department of Medical Oncology, The First Hospital of China Medical University, China
- Yixiong Chen: Department of Computer Science, Johns Hopkins University, United States of America
- Xiaorui Lin: The Second Clinical College, China Medical University, China
- Yutong Tang: The Second Clinical College, China Medical University, China
- Yining Cao: The Second Clinical College, China Medical University, China
- Haoqi Han: The Second Clinical College, China Medical University, China
- Zheyuan Zhang: Department of Mechanical Engineering and the Laboratory of Computational Sensing and Robotics, Johns Hopkins University, United States of America
- Jiawei Liu: Department of Mechanical Engineering and the Laboratory of Computational Sensing and Robotics, Johns Hopkins University, United States of America
- Tiezheng Zhang: Department of Computer Science, Johns Hopkins University, United States of America
- Yujiu Ma: Center of Reproductive Medicine, Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, China
- Jincheng Wang: Radiology Department, the First Affiliated Hospital, School of Medicine, Zhejiang University, China
- Guang Zhang: Department of Health Management, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, China; Shandong Engineering Research Center of Health Management, China; Shandong Institute of Health Management, China
- Alan Yuille: Department of Computer Science, Johns Hopkins University, United States of America
- Zongwei Zhou: Department of Computer Science, Johns Hopkins University, United States of America
4. Chen Y, Gao Y, Zhu L, Shao W, Lu Y, Han H, Xie Z. PCNet: Prior Category Network for CT Universal Segmentation Model. IEEE Trans Med Imaging 2024; 43:3319-3330. PMID: 38687654; DOI: 10.1109/tmi.2024.3395349.
Abstract
Accurate segmentation of anatomical structures in computed tomography (CT) images is crucial for clinical diagnosis, treatment planning, and disease monitoring. Current deep learning segmentation methods are hindered by factors such as data scale and model size. Inspired by how doctors identify tissues, we propose the Prior Category Network (PCNet), a novel approach that boosts segmentation performance by leveraging prior knowledge of the relationships between different categories of anatomical structures. PCNet comprises three key components: a prior category prompt (PCP), a hierarchy category system (HCS), and a hierarchy category loss (HCL). The PCP uses Contrastive Language-Image Pretraining (CLIP), along with attention modules, to systematically define the relationships between anatomical categories as identified by clinicians. The HCS guides the segmentation model in distinguishing between specific organs, anatomical structures, and functional systems through hierarchical relationships. The HCL serves as a consistency constraint, reinforcing the directional guidance provided by the HCS to enhance the segmentation model's accuracy and robustness. Extensive experiments validate the effectiveness of the approach: PCNet yields a high-performance universal model for CT segmentation and demonstrates significant transferability on multiple downstream tasks. Ablation experiments show that the construction of the HCS is of critical importance. The prompt and HCS can be accessed at https://github.com/PKU-MIPET/PCNet.
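As a rough illustration of text-prompt-based category conditioning in the spirit of the PCP, the sketch below scores per-voxel features against per-category text embeddings via a normalized dot product. Random tensors stand in for the CLIP text encoder and the CNN backbone; the shapes and the scoring rule are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

n_classes, emb_dim = 5, 512
voxel_features = torch.randn(1, emb_dim, 8, 32, 32)   # backbone output (B, C, D, H, W)
text_embeddings = torch.randn(n_classes, emb_dim)     # stand-in for CLIP("a CT of the liver"), ...

# Normalize and score each voxel against each category embedding.
v = F.normalize(voxel_features, dim=1)
t = F.normalize(text_embeddings, dim=1)
logits = torch.einsum("bcdhw,kc->bkdhw", v, t)        # (B, n_classes, D, H, W)
probs = logits.softmax(dim=1)
print(probs.shape)   # torch.Size([1, 5, 8, 32, 32])
```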
5. Kumar K, Yeo AU, McIntosh L, Kron T, Wheeler G, Franich RD. Deep Learning Auto-Segmentation Network for Pediatric Computed Tomography Data Sets: Can We Extrapolate From Adults? Int J Radiat Oncol Biol Phys 2024; 119:1297-1306. PMID: 38246249; DOI: 10.1016/j.ijrobp.2024.01.201.
Abstract
PURPOSE: Artificial intelligence (AI)-based auto-segmentation models hold promise for enhanced efficiency and consistency in organ contouring for adaptive radiation therapy and radiation therapy planning. However, their performance on pediatric computed tomography (CT) data and their cross-scanner compatibility remain unclear. This study evaluated the performance of AI-based auto-segmentation models trained on adult CT data when applied to pediatric data sets, explored the improvement gained by including pediatric training data, and examined the models' ability to accurately segment CT data acquired from different scanners.
METHODS AND MATERIALS: Using the nnU-Net framework, segmentation models were trained on data sets of adult, pediatric, and combined CT scans for 7 pelvic/thoracic organs, with 290 to 300 cases per category and organ. Training data sets combined clinical data and several open repositories. The study incorporated a database of 459 pediatric (0-16 years) and 950 adult (>18 years) CT scans, all with human expert ground-truth contours of the selected organs. Performance was evaluated using Dice similarity coefficients (DSC) of the model-generated contours.
RESULTS: AI models trained exclusively on adult data underperformed on pediatric data, especially for the 0 to 2 age group: mean DSC was below 0.5 for the bladder and spleen. Adding pediatric training data yielded significant improvement for all age groups, achieving a mean DSC above 0.85 for all organs in every age group. Larger organs such as the liver and kidneys maintained consistent performance across age groups for all models. No significant difference emerged in the cross-scanner evaluation, suggesting robust cross-scanner generalization.
CONCLUSIONS: For optimal segmentation across age groups, it is important to include pediatric data in the training of segmentation models. The successful cross-scanner generalization also supports the real-world clinical applicability of these AI models. This study emphasizes the significance of data set diversity in training robust AI systems for medical image interpretation tasks.
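The Dice similarity coefficient used throughout this evaluation is straightforward to compute on binary masks; a small sketch with illustrative toy masks:

```python
import numpy as np

def dice_coefficient(pred, truth, eps=1e-8):
    """DSC = 2|A ∩ B| / (|A| + |B|) for binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum() + eps)

# Toy 3D masks: a ground-truth cube and a slightly shifted prediction.
truth = np.zeros((64, 64, 64), np.uint8)
truth[20:40, 20:40, 20:40] = 1
pred = np.zeros_like(truth)
pred[22:42, 20:40, 20:40] = 1
print(f"DSC = {dice_coefficient(pred, truth):.3f}")   # 0.900
```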
Affiliation(s)
- Kartik Kumar: Physical Sciences Department, Peter MacCallum Cancer Centre, Victoria, Australia; School of Science, RMIT University, Melbourne, Victoria, Australia
- Adam U Yeo: Physical Sciences Department, Peter MacCallum Cancer Centre, Victoria, Australia; School of Science, RMIT University, Melbourne, Victoria, Australia; Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria, Australia
- Lachlan McIntosh: Physical Sciences Department, Peter MacCallum Cancer Centre, Victoria, Australia; School of Science, RMIT University, Melbourne, Victoria, Australia
- Tomas Kron: Physical Sciences Department, Peter MacCallum Cancer Centre, Victoria, Australia; School of Science, RMIT University, Melbourne, Victoria, Australia; Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria, Australia; Centre for Medical Radiation Physics, University of Wollongong, Wollongong, New South Wales, Australia
- Greg Wheeler: Physical Sciences Department, Peter MacCallum Cancer Centre, Victoria, Australia; Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria, Australia
- Rick D Franich: Physical Sciences Department, Peter MacCallum Cancer Centre, Victoria, Australia; School of Science, RMIT University, Melbourne, Victoria, Australia
6. Kim S, Park H, Kang M, Jin KH, Adeli E, Pohl KM, Park SH. Federated learning with knowledge distillation for multi-organ segmentation with partially labeled datasets. Med Image Anal 2024; 95:103156. PMID: 38603844; DOI: 10.1016/j.media.2024.103156.
Abstract
State-of-the-art multi-organ CT segmentation relies on deep learning models, which generalize only when trained on large samples of carefully curated data. However, it is challenging to train a single model that can segment all organs and tumor types, since most large datasets are partially labeled or are acquired across multiple institutes that may differ in their acquisition protocols. A possible solution is federated learning, which is often used to train models on multi-institutional datasets whose data are not shared across sites. However, the predictions of federated models can become unreliable after the model is locally updated at a site, due to 'catastrophic forgetting'. Here, we address this issue with knowledge distillation (KD), so that local training is regularized with the knowledge of a global model and of pre-trained organ-specific segmentation models. We implement the models in a multi-head U-Net architecture that learns a shared embedding space for the different organ segmentation tasks, thereby obtaining multi-organ predictions without repeated inference. We evaluate the proposed method using 8 publicly available abdominal CT datasets covering 7 different organs; 889 CTs were used for training, 233 for internal testing, and 30 volumes for external testing. Experimental results verify that the proposed method substantially outperforms other state-of-the-art methods in terms of accuracy, inference time, and number of parameters.
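The distillation regularizer can be sketched as a supervised loss on the locally annotated voxels plus a KL term tying the local student to the frozen global teacher. The temperature, weighting, and tensor shapes below are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def kd_regularized_loss(student_logits, teacher_logits, labels, mask, T=2.0, alpha=0.5):
    """labels: voxel-wise class ids; mask: 1 where the site has annotations."""
    # Supervised term, averaged only over annotated voxels.
    ce = F.cross_entropy(student_logits, labels, reduction="none")
    supervised = (ce * mask).sum() / mask.sum().clamp(min=1)
    # Distillation term: keep the student close to the global teacher.
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * T * T
    return supervised + alpha * kd

B, K, D, H, W = 2, 8, 8, 16, 16
student = torch.randn(B, K, D, H, W, requires_grad=True)
teacher = torch.randn(B, K, D, H, W)                 # frozen global model output
labels = torch.randint(0, K, (B, D, H, W))
mask = (torch.rand(B, D, H, W) > 0.3).float()        # partially labeled voxels
print(kd_regularized_loss(student, teacher, labels, mask).item())
```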
Affiliation(s)
- Soopil Kim: Department of Robotics and Mechatronics Engineering, Daegu Gyeongbuk Institute of Science and Technology, Republic of Korea; Department of Psychiatry and Behavioral Sciences, Stanford University, CA 94305, USA
- Heejung Park: Department of Robotics and Mechatronics Engineering, Daegu Gyeongbuk Institute of Science and Technology, Republic of Korea
- Myeongkyun Kang: Department of Robotics and Mechatronics Engineering, Daegu Gyeongbuk Institute of Science and Technology, Republic of Korea; Department of Psychiatry and Behavioral Sciences, Stanford University, CA 94305, USA
- Kyong Hwan Jin: School of Electrical Engineering, Korea University, Republic of Korea
- Ehsan Adeli: Department of Psychiatry and Behavioral Sciences, Stanford University, CA 94305, USA
- Kilian M Pohl: Department of Psychiatry and Behavioral Sciences, Stanford University, CA 94305, USA
- Sang Hyun Park: Department of Robotics and Mechatronics Engineering, Daegu Gyeongbuk Institute of Science and Technology, Republic of Korea
7. Liu H, Xu Z, Gao R, Li H, Wang J, Chabin G, Oguz I, Grbic S. COSST: Multi-Organ Segmentation With Partially Labeled Datasets Using Comprehensive Supervisions and Self-Training. IEEE Trans Med Imaging 2024; 43:1995-2009. PMID: 38224508; DOI: 10.1109/tmi.2024.3354673.
Abstract
Deep learning models have demonstrated remarkable success in multi-organ segmentation but typically require large-scale datasets with all organs of interest annotated. However, medical image datasets are often low in sample size and only partially labeled, i.e., only a subset of organs is annotated. It is therefore crucial to investigate how to learn a unified model on the available partially labeled datasets to leverage their synergistic potential. In this paper, we systematically investigate the partial-label segmentation problem with theoretical and empirical analyses of prior techniques. We revisit the problem from the perspective of partial-label supervision signals and identify two signals derived from ground truth and one from pseudo labels. We propose a novel two-stage framework termed COSST, which effectively and efficiently integrates comprehensive supervision signals with self-training. Concretely, we first train an initial unified model using the two ground-truth-based signals and then iteratively incorporate the pseudo-label signal into the initial model using self-training. To mitigate performance degradation caused by unreliable pseudo labels, we assess the reliability of pseudo labels via outlier detection in latent space and exclude the most unreliable pseudo labels from each self-training iteration. Extensive experiments are conducted on one public and three private partial-label segmentation tasks over 12 CT datasets. COSST achieves significant improvement over the baseline of individual networks trained on each partially labeled dataset, and demonstrates consistently superior performance over state-of-the-art partial-label segmentation methods across segmentation tasks and training data sizes.
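The pseudo-label filtering step can be sketched as outlier detection on per-sample latent vectors: samples whose Mahalanobis distance from the latent mean exceeds a quantile cutoff are excluded. The latent vectors and the 10% cutoff below are made-up assumptions, not the paper's exact criterion:

```python
import numpy as np

rng = np.random.default_rng(0)
latents = rng.normal(size=(200, 16))        # stand-in latent vectors, one per sample
latents[:10] += 6.0                         # inject a few "unreliable" outliers

# Mahalanobis distance of each latent vector from the distribution mean.
mu = latents.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(latents, rowvar=False) + 1e-6 * np.eye(16))
diff = latents - mu
d = np.sqrt(np.einsum("nd,dk,nk->n", diff, cov_inv, diff))

cutoff = np.quantile(d, 0.90)               # drop the most anomalous 10%
keep = d <= cutoff
print(f"kept {keep.sum()} / {len(keep)} pseudo-labeled samples")
```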
8. Huang Y, Yang J, Sun Q, Yuan Y, Li H, Hou Y. Multi-residual 2D network integrating spatial correlation for whole heart segmentation. Comput Biol Med 2024; 172:108261. PMID: 38508056; DOI: 10.1016/j.compbiomed.2024.108261.
Abstract
Whole heart segmentation (WHS) has significant clinical value for cardiac anatomy, modeling, and analysis of cardiac function. This study addresses WHS accuracy on cardiac CT images, together with the fast inference speed and low graphics processing unit (GPU) memory consumption required by practical clinical applications. We propose a multi-residual two-dimensional (2D) network integrating spatial correlation for WHS. The network segments three-dimensional cardiac CT images slice by slice in a 2D encoder-decoder manner. A convolutional long short-term memory skip-connection module performs spatial-correlation feature extraction on the feature maps at different resolutions extracted by the sub-modules of the pre-trained ResNet-based encoder. Moreover, a decoder based on the multi-residual module analyzes the extracted features from the perspectives of multi-scale and channel attention, thereby accurately delineating the various substructures of the heart. The proposed method is verified on a dataset of the multi-modality WHS challenge, an in-house WHS dataset, and a dataset of the abdominal organ segmentation challenge. The Dice, Jaccard, average symmetric surface distance, Hausdorff distance, inference time, and maximum GPU memory for WHS are 0.914, 0.843, 1.066 mm, 15.778 mm, 9.535 s, and 1905 MB, respectively. The proposed network offers high accuracy, fast inference, minimal GPU memory consumption, strong robustness, and good generalization. It can be deployed in clinical practice for WHS and can be effectively extended to other multi-organ segmentation tasks. The source code is publicly available at https://github.com/nancy1984yan/MultiResNet-SC.
Affiliation(s)
- Yan Huang: Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, Liaoning, China; School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning, China
- Jinzhu Yang: Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, Liaoning, China; School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning, China; National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Shenyang, Liaoning, China
- Qi Sun: Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, Liaoning, China; School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning, China
- Yuliang Yuan: Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, Liaoning, China; School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning, China
- Honghe Li: Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, Liaoning, China; School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning, China
- Yang Hou: Department of Radiology, Shengjing Hospital of China Medical University, Shenyang, Liaoning, China
9. Xing Z, Zhu L, Yu L, Xing Z, Wan L. Hybrid Masked Image Modeling for 3D Medical Image Segmentation. IEEE J Biomed Health Inform 2024; 28:2115-2125. PMID: 38289846; DOI: 10.1109/jbhi.2024.3360239.
Abstract
Masked image modeling (MIM) with transformer backbones has recently been exploited as a powerful self-supervised pre-training technique. Existing MIM methods mask random patches of the image and reconstruct the missing pixels, which considers only low-level semantic information and incurs long pre-training times. This paper presents HybridMIM, a novel hybrid self-supervised learning method based on masked image modeling for 3D medical image segmentation. Specifically, we design a two-level masking hierarchy that specifies which patches in which sub-volumes are masked, effectively providing constraints from higher-level semantic information. We then learn the semantic information of medical images at three levels: (1) partial region prediction to reconstruct key contents of the 3D image, which largely reduces the pre-training time burden (pixel level); (2) patch-masking perception to learn the spatial relationships between the patches in each sub-volume (region level); and (3) dropout-based contrastive learning between samples within a mini-batch, which further improves the generalization ability of the framework (sample level). The framework is versatile, supporting both CNN and transformer encoder backbones, and also enables pre-training of decoders for image segmentation. We conduct comprehensive experiments on five widely used public medical image segmentation datasets: BraTS2020, BTCV, MSD Liver, MSD Spleen, and BraTS2023. The experimental results show the clear superiority of HybridMIM over competing supervised methods, masked pre-training approaches, and other self-supervised methods, in terms of quantitative metrics, speed, and qualitative observations.
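A two-level masking hierarchy of this kind can be sketched as follows: sub-volumes are selected at the first level, and patches inside the selected sub-volumes are masked at the second. The sizes and masking ratios below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(0)
volume = rng.random((64, 64, 64))
sub, patch = 32, 8                         # sub-volume and patch edge lengths

mask = np.zeros_like(volume, dtype=bool)
for z in range(0, 64, sub):
    for y in range(0, 64, sub):
        for x in range(0, 64, sub):
            if rng.random() < 0.5:         # level 1: skip half the sub-volumes
                continue
            for pz in range(z, z + sub, patch):     # level 2: mask ~40% of patches
                for py in range(y, y + sub, patch):
                    for px in range(x, x + sub, patch):
                        if rng.random() < 0.4:
                            mask[pz:pz + patch, py:py + patch, px:px + patch] = True

masked_volume = np.where(mask, 0.0, volume)   # masked voxels to be reconstructed
print(f"masked fraction: {mask.mean():.2f}")
```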
10. Qian B, Chen H, Wang X, Guan Z, Li T, Jin Y, Wu Y, Wen Y, Che H, Kwon G, Kim J, Choi S, Shin S, Krause F, Unterdechler M, Hou J, Feng R, Li Y, El Habib Daho M, Yang D, Wu Q, Zhang P, Yang X, Cai Y, Tan GSW, Cheung CY, Jia W, Li H, Tham YC, Wong TY, Sheng B. DRAC 2022: A public benchmark for diabetic retinopathy analysis on ultra-wide optical coherence tomography angiography images. Patterns (N Y) 2024; 5:100929. PMID: 38487802; PMCID: PMC10935505; DOI: 10.1016/j.patter.2024.100929.
Abstract
We describe the Diabetic Retinopathy Analysis Challenge (DRAC), organized in conjunction with the 25th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2022). For this challenge, we provided the DRAC dataset, an ultra-wide optical coherence tomography angiography (UW-OCTA) dataset of 1,103 images, addressing three primary clinical tasks: diabetic retinopathy (DR) lesion segmentation, image quality assessment, and DR grading. The scientific community responded positively, with 11, 12, and 13 teams submitting solutions for the three tasks, respectively. This paper presents a concise summary and analysis of the top-performing solutions and results across all challenge tasks. These solutions could provide practical guidance for developing accurate classification and segmentation models for image quality assessment and DR diagnosis using UW-OCTA images, potentially improving the diagnostic capabilities of healthcare professionals. The dataset has been released to support the development of computer-aided diagnostic systems for DR evaluation.
Affiliation(s)
- Bo Qian: Shanghai Belt and Road International Joint Laboratory for Intelligent Prevention and Treatment of Metabolic Disorders, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai 200240, China; MOE Key Laboratory of AI, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Hao Chen: Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong 999077, China; Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Hong Kong 999077, China
- Xiangning Wang: Shanghai Belt and Road International Joint Laboratory for Intelligent Prevention and Treatment of Metabolic Disorders, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai 200240, China; Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai 200233, China
- Zhouyu Guan: Shanghai Belt and Road International Joint Laboratory for Intelligent Prevention and Treatment of Metabolic Disorders, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai 200240, China
- Tingyao Li: Shanghai Belt and Road International Joint Laboratory for Intelligent Prevention and Treatment of Metabolic Disorders, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai 200240, China; MOE Key Laboratory of AI, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Yixiao Jin: Tsinghua Medicine, Tsinghua University, Beijing 100084, China
- Yilan Wu: Tsinghua Medicine, Tsinghua University, Beijing 100084, China
- Yang Wen: School of Electronic and Information Engineering, Shenzhen University, Shenzhen 518060, China
- Haoxuan Che: Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong 999077, China
- Sungjin Choi: AI/DX Convergence Business Group, KT, Seongnam 13606, Korea
- Seoyoung Shin: AI/DX Convergence Business Group, KT, Seongnam 13606, Korea
- Felix Krause: Johannes Kepler University Linz, Linz 4040, Austria
- Junlin Hou: School of Computer Science, Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai 200433, China
- Rui Feng: School of Computer Science, Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai 200433, China; Academy for Engineering and Technology, Fudan University, Shanghai 200433, China
- Yihao Li: LaTIM UMR 1101, INSERM, 29609 Brest, France; University of Western Brittany, 29238 Brest, France
- Mostafa El Habib Daho: LaTIM UMR 1101, INSERM, 29609 Brest, France; University of Western Brittany, 29238 Brest, France
- Dawei Yang: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong 999077, China
- Qiang Wu: Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai 200233, China
- Ping Zhang: Department of Computer Science and Engineering, The Ohio State University, Columbus, OH 43210, USA; Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA; Translational Data Analytics Institute, The Ohio State University, Columbus, OH 43210, USA
- Xiaokang Yang: MOE Key Laboratory of AI, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Yiyu Cai: School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore 639798, Singapore
- Gavin Siew Wei Tan: Singapore Eye Research Institute, Singapore National Eye Centre, Singapore 168751, Singapore
- Carol Y. Cheung: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong 999077, China
- Weiping Jia: Shanghai Belt and Road International Joint Laboratory for Intelligent Prevention and Treatment of Metabolic Disorders, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai 200240, China
- Huating Li: Shanghai Belt and Road International Joint Laboratory for Intelligent Prevention and Treatment of Metabolic Disorders, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai 200240, China
- Yih Chung Tham: Singapore Eye Research Institute, Singapore National Eye Centre, Singapore 168751, Singapore; Centre for Innovation and Precision Eye Health and Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 119228, Singapore; Ophthalmology and Visual Sciences Academic Clinical Program, Duke-NUS Medical School, Singapore 169857, Singapore
- Tien Yin Wong: Tsinghua Medicine, Tsinghua University, Beijing 100084, China; Singapore Eye Research Institute, Singapore National Eye Centre, Singapore 168751, Singapore; School of Clinical Medicine, Beijing Tsinghua Changgung Hospital, Beijing 102218, China
- Bin Sheng: Shanghai Belt and Road International Joint Laboratory for Intelligent Prevention and Treatment of Metabolic Disorders, Department of Computer Science and Engineering, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai 200240, China; MOE Key Laboratory of AI, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
11. Adiga V S, Dolz J, Lombaert H. Anatomically-aware uncertainty for semi-supervised image segmentation. Med Image Anal 2024; 91:103011. PMID: 37924752; DOI: 10.1016/j.media.2023.103011.
Abstract
Semi-supervised learning relaxes the need for large pixel-wise labeled datasets for image segmentation by leveraging unlabeled data. A prominent way to exploit unlabeled data is to regularize model predictions. Since predictions on unlabeled data can be unreliable, uncertainty-aware schemes are typically employed to gradually learn from meaningful and reliable predictions. Uncertainty estimation methods, however, rely on multiple inferences from the model that must be computed at each training step, which is computationally expensive. Moreover, these uncertainty maps capture pixel-wise disparities and do not consider global information. This work proposes a novel method to estimate segmentation uncertainty by leveraging global information from the segmentation masks. More precisely, an anatomically-aware representation is first learnt to model the available segmentation masks. This representation then maps the prediction for a new image onto an anatomically plausible segmentation. The deviation from the plausible segmentation yields the underlying pixel-level uncertainty, which in turn guides the segmentation network. The proposed method consequently estimates the uncertainty with a single inference from our representation, reducing the total computation. We evaluate our method on two publicly available segmentation datasets, of the left atria in cardiac MRIs and of multiple organs in abdominal CTs. Our anatomically-aware method improves segmentation accuracy over state-of-the-art semi-supervised methods in terms of two commonly used evaluation metrics.
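The core idea can be sketched in a few lines: a learnt shape model projects a raw prediction onto an anatomically plausible one, and the voxel-wise deviation between the two is the uncertainty map, obtained in a single inference. Below, an untrained toy autoencoder stands in for the learnt representation, which in practice would be trained on the available segmentation masks:

```python
import torch
import torch.nn as nn

# Stand-in for the learnt anatomically-aware representation (untrained here).
shape_model = nn.Sequential(
    nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(8, 8, 3, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(8, 8, 2, stride=2), nn.ReLU(),
    nn.ConvTranspose2d(8, 1, 2, stride=2), nn.Sigmoid(),
)

prediction = torch.rand(1, 1, 64, 64)        # soft mask from the segmentation net
with torch.no_grad():
    plausible = shape_model(prediction)      # projection to "plausible" space
uncertainty = (prediction - plausible).abs() # single-inference uncertainty map
print(uncertainty.shape)                     # torch.Size([1, 1, 64, 64])
```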
Affiliation(s)
- Sukesh Adiga V: Computer and Software Engineering Department, ETS Montreal, 1100 Notre Dame St. W., Montreal QC, H3C 1K3, Canada
- Jose Dolz: Computer and Software Engineering Department, ETS Montreal, 1100 Notre Dame St. W., Montreal QC, H3C 1K3, Canada
- Herve Lombaert: Computer and Software Engineering Department, ETS Montreal, 1100 Notre Dame St. W., Montreal QC, H3C 1K3, Canada
12. Zhang S, Metaxas D. On the challenges and perspectives of foundation models for medical image analysis. Med Image Anal 2024; 91:102996. PMID: 37857067; DOI: 10.1016/j.media.2023.102996.
Abstract
This article discusses the opportunities, applications, and future directions of large-scale pretrained models, i.e., foundation models, which promise to significantly improve the analysis of medical images. Medical foundation models have immense potential for solving a wide range of downstream tasks, as they can help accelerate the development of accurate and robust models, reduce the dependence on large amounts of labeled data, and preserve the privacy and confidentiality of patient data. Specifically, we illustrate the "spectrum" of medical foundation models, ranging from general imaging models through modality-specific models to organ- and task-specific models, and highlight their challenges, opportunities, and applications. We also discuss how foundation models can be leveraged in downstream medical tasks to enhance the accuracy and efficiency of medical image analysis, leading to more precise diagnosis and treatment decisions.
Affiliation(s)
- Shaoting Zhang: University of Electronic Science and Technology of China, Chengdu, Sichuan, China; Shanghai Artificial Intelligence Laboratory, Shanghai, China
13. Marchant T, Price G, McWilliam A, Henderson E, McSweeney D, van Herk M, Banfill K, Schmitt M, King J, Barker C, Faivre-Finn C. Assessment of heart-substructures auto-contouring accuracy for application in heart-sparing radiotherapy for lung cancer. BJR Open 2024; 6:tzae006. PMID: 38737623; PMCID: PMC11087931; DOI: 10.1093/bjro/tzae006.
Abstract
Objectives: We validated an auto-contouring algorithm for heart substructures in lung cancer patients, aiming to establish its accuracy and reliability for radiotherapy (RT) planning. We focus on contouring an amalgamated set of subregions in the base of the heart considered to be a new organ at risk, the cardiac avoidance area (CAA), to enable maximum dose limit implementation in lung RT planning.
Methods: The study validates a deep-learning model specifically adapted for auto-contouring the CAA (which includes the right atrium, aortic valve root, and proximal segments of the left and right coronary arteries). Geometric, dosimetric, quantitative, and qualitative validation measures are reported. Comparison with manual contours, including assessment of interobserver variability, and robustness testing over 198 cases are also conducted.
Results: Geometric validation shows that auto-contouring performance lies within the expected range of manual observer variability, despite being slightly poorer than the average of manual observers (mean surface distance for the CAA of 1.6 vs 1.2 mm, Dice similarity coefficient of 0.86 vs 0.88). Dosimetric validation demonstrates consistency between plans optimized using auto-contours and manual contours. Robustness testing confirms acceptable contours in all cases, with 80% rated as "Good" and the remaining 20% as "Useful."
Conclusions: The auto-contouring algorithm for heart substructures in lung cancer patients demonstrates acceptable and comparable performance to human observers.
Advances in knowledge: Accurate and reliable auto-contouring results for the CAA facilitate the implementation of a maximum dose limit to this region in lung RT planning, which has now been introduced in the routine setting at our institution.
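The mean surface distance used in the geometric validation can be computed from distance transforms of the two binary masks; a small sketch with toy masks (distances are in voxel units and would be scaled by the voxel spacing to obtain millimetres):

```python
import numpy as np
from scipy import ndimage

def surface(mask):
    """Boundary voxels: the mask minus its binary erosion."""
    return mask & ~ndimage.binary_erosion(mask)

def mean_surface_distance(a, b):
    sa, sb = surface(a), surface(b)
    # Distance from each surface voxel of one mask to the other surface.
    da = ndimage.distance_transform_edt(~sb)[sa]
    db = ndimage.distance_transform_edt(~sa)[sb]
    return (da.sum() + db.sum()) / (len(da) + len(db))

# Toy masks: a cube and a slightly shifted copy.
a = np.zeros((48, 48, 48), bool)
a[10:30, 10:30, 10:30] = True
b = np.zeros_like(a)
b[12:32, 10:30, 10:30] = True
print(f"MSD = {mean_surface_distance(a, b):.2f} voxels")
```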
Affiliation(s)
- Tom Marchant: Christie Medical Physics & Engineering, The Christie NHS Foundation Trust, Manchester, M20 4BX, United Kingdom; Division of Cancer Sciences, The University of Manchester, Manchester, M13 9PL, United Kingdom
- Gareth Price: Division of Cancer Sciences, The University of Manchester, Manchester, M13 9PL, United Kingdom; Radiotherapy Related Research, The Christie NHS Foundation Trust, Manchester, M20 4BX, United Kingdom
- Alan McWilliam: Division of Cancer Sciences, The University of Manchester, Manchester, M13 9PL, United Kingdom; Radiotherapy Related Research, The Christie NHS Foundation Trust, Manchester, M20 4BX, United Kingdom
- Edward Henderson: Division of Cancer Sciences, The University of Manchester, Manchester, M13 9PL, United Kingdom; Radiotherapy Related Research, The Christie NHS Foundation Trust, Manchester, M20 4BX, United Kingdom
- Dónal McSweeney: Division of Cancer Sciences, The University of Manchester, Manchester, M13 9PL, United Kingdom; Radiotherapy Related Research, The Christie NHS Foundation Trust, Manchester, M20 4BX, United Kingdom
- Marcel van Herk: Division of Cancer Sciences, The University of Manchester, Manchester, M13 9PL, United Kingdom; Radiotherapy Related Research, The Christie NHS Foundation Trust, Manchester, M20 4BX, United Kingdom
- Kathryn Banfill: Division of Cancer Sciences, The University of Manchester, Manchester, M13 9PL, United Kingdom; Department of Clinical Oncology, The Christie NHS Foundation Trust, Manchester, M20 4BX, United Kingdom
- Matthias Schmitt: Division of Cardiovascular Sciences, The University of Manchester, Manchester, M13 9PL, United Kingdom; Department of Cardiology, Manchester University NHS Foundation Trust, Manchester, M13 9WL, United Kingdom
- Jennifer King: Department of Clinical Oncology, The Christie NHS Foundation Trust, Manchester, M20 4BX, United Kingdom
- Claire Barker: Department of Clinical Oncology, The Christie NHS Foundation Trust, Manchester, M20 4BX, United Kingdom
- Corinne Faivre-Finn: Division of Cancer Sciences, The University of Manchester, Manchester, M13 9PL, United Kingdom; Department of Clinical Oncology, The Christie NHS Foundation Trust, Manchester, M20 4BX, United Kingdom
14. Zhang M, Wu Y, Zhang H, Qin Y, Zheng H, Tang W, Arnold C, Pei C, Yu P, Nan Y, Yang G, Walsh S, Marshall DC, Komorowski M, Wang P, Guo D, Jin D, Wu Y, Zhao S, Chang R, Zhang B, Lu X, Qayyum A, Mazher M, Su Q, Wu Y, Liu Y, Zhu Y, Yang J, Pakzad A, Rangelov B, Estepar RSJ, Espinosa CC, Sun J, Yang GZ, Gu Y. Multi-site, Multi-domain Airway Tree Modeling. Med Image Anal 2023; 90:102957. PMID: 37716199; DOI: 10.1016/j.media.2023.102957.
Abstract
Open international challenges are becoming the de facto standard for assessing computer vision and image analysis algorithms. In recent years, new methods have extended pulmonary airway segmentation closer to the limit of image resolution. Since the EXACT'09 pulmonary airway segmentation challenge, however, limited effort has been directed to quantitative comparison of the newly emerged algorithms, which are driven by the maturity of deep learning-based approaches and by extensive clinical efforts to resolve finer details of distal airways for early intervention in pulmonary diseases. Thus far, publicly annotated datasets have been extremely limited, hindering the development of data-driven methods and detailed performance evaluation of new algorithms. To provide a benchmark for the medical imaging community, we organized the Multi-site, Multi-domain Airway Tree Modeling challenge (ATM'22), held as an official challenge event during the MICCAI 2022 conference. ATM'22 provides large-scale CT scans with detailed pulmonary airway annotation: 500 CT scans (300 for training, 50 for validation, and 150 for testing). The dataset was collected from different sites and further includes a portion of noisy COVID-19 CTs with ground-glass opacity and consolidation. Twenty-three teams participated in the entire phase of the challenge, and the algorithms of the top ten teams are reviewed in this paper. Both quantitative and qualitative results revealed that deep learning models embedded with topological continuity enhancement achieved superior performance in general. ATM'22 remains an open call; the training data and the gold-standard evaluation are available upon successful registration via its homepage (https://atm22.grand-challenge.org/).
Affiliation(s)
- Minghui Zhang: Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China; Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, 200240, China; Department of Automation, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yangqian Wu: Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China; Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, 200240, China; Department of Automation, Shanghai Jiao Tong University, Shanghai, 200240, China
- Hanxiao Zhang: Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yulei Qin: Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China
- Hao Zheng: Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China
- Wen Tang: InferVision Medical Technology Co., Ltd., Beijing, China
- Chenhao Pei: InferVision Medical Technology Co., Ltd., Beijing, China
- Pengxin Yu: InferVision Medical Technology Co., Ltd., Beijing, China
- Yang Nan: Imperial College London, London, UK
- Puyang Wang: Alibaba DAMO Academy, 969 West Wen Yi Road, Hangzhou, Zhejiang, China
- Dazhou Guo: Alibaba DAMO Academy USA, 860 Washington Street, 8F, NY, USA
- Dakai Jin: Alibaba DAMO Academy USA, 860 Washington Street, 8F, NY, USA
- Ya'nan Wu: College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Shuiqing Zhao: College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Runsheng Chang: College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Boyu Zhang: A.I R&D Center, Sanmed Biotech Inc., No. 266 Tongchang Road, Xiangzhou District, Zhuhai, Guangdong, China
- Xing Lu: A.I R&D Center, Sanmed Biotech Inc., T220 Trade St., San Diego, CA, USA
- Abdul Qayyum: ENIB, UMR CNRS 6285 LabSTICC, Brest, 29238, France
- Moona Mazher: Department of Computer Engineering and Mathematics, University Rovira I Virgili, Tarragona, Spain
- Qi Su: Shanghai Jiao Tong University, Shanghai, China
- Yonghuang Wu: School of Information Science and Technology, Fudan University, Shanghai, China
- Ying'ao Liu: University of Science and Technology of China, Hefei, Anhui, China
- Jiancheng Yang: Dianei Technology, Shanghai, China; EPFL, Lausanne, Switzerland
- Ashkan Pakzad: Medical Physics and Biomedical Engineering Department, University College London, London, UK
- Bojidar Rangelov: Center for Medical Image Computing, University College London, London, UK
- Jiayuan Sun: Department of Respiratory and Critical Care Medicine, Department of Respiratory Endoscopy, Shanghai Chest Hospital, Shanghai, China
- Guang-Zhong Yang: Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yun Gu: Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China; Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, 200240, China; Department of Automation, Shanghai Jiao Tong University, Shanghai, 200240, China
15. Wu Y, Zhao S, Qi S, Feng J, Pang H, Chang R, Bai L, Li M, Xia S, Qian W, Ren H. Two-stage contextual transformer-based convolutional neural network for airway extraction from CT images. Artif Intell Med 2023; 143:102637. PMID: 37673569; DOI: 10.1016/j.artmed.2023.102637.
Abstract
Accurate airway segmentation from computed tomography (CT) images is critical for planning navigation bronchoscopy and for quantitative assessment of airway-related chronic obstructive pulmonary disease (COPD). Existing methods struggle with airway segmentation, particularly for small airway branches, owing to limited labeling, and they fail to meet the requirements of clinical use in COPD. We propose a two-stage framework with a novel 3D contextual transformer for segmenting the overall airway and small airway branches from CT images. The method consists of two training stages sharing the same modified 3D U-Net. The novel 3D contextual transformer block is integrated into both the encoder and decoder paths of the network to effectively capture contextual and long-range information. In the first training stage, the network segments the overall airway using the overall airway mask. To improve performance on small branches, we generate an intrapulmonary airway-branch label and train the network to focus on producing small airway branches in the second training stage. Extensive experiments were performed on an in-house dataset and multiple public datasets. Quantitative and qualitative analyses demonstrate that the proposed method extracts significantly more branches and greater airway-tree length while achieving state-of-the-art airway segmentation performance. The code is available at https://github.com/zhaozsq/airway_segmentation.
Affiliation(s)
- Yanan Wu: College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China; Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong, China
- Shuiqing Zhao: College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Shouliang Qi: College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China
- Jie Feng: School of Chemical Equipment, Shenyang University of Technology, Liaoyang, China
- Haowen Pang: College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Runsheng Chang: College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Long Bai: Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong, China
- Mengqi Li: Department of Respiratory, the Second Affiliated Hospital of Dalian Medical University, Dalian, China
- Shuyue Xia: Respiratory Department, Central Hospital Affiliated to Shenyang Medical College, Shenyang, China
- Wei Qian: College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Hongliang Ren: Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong, China
16. Dan Y, Jin W, Wang Z, Sun C. Optimization of U-shaped pure transformer medical image segmentation network. PeerJ Comput Sci 2023; 9:e1515. PMID: 37705654; PMCID: PMC10495965; DOI: 10.7717/peerj-cs.1515.
Abstract
In recent years, neural networks have made pioneering achievements in medical imaging. In particular, deep neural networks based on U-shaped structures are widely used in different medical image segmentation tasks. Using neural networks for lung segmentation, to assist in localization and in observing organ shape, has become a key step in improving early diagnosis and clinical decision support for lung diseases, but segmentation precision remains a problem. To achieve better segmentation accuracy, an optimized pure-transformer U-shaped segmentation network is proposed in this article. The optimized network adds skip connections with a dedicated splicing step, which reduces information loss during encoding and enriches the information available during decoding, thereby improving segmentation accuracy. The final experiment shows that the improved network achieves 97.86% accuracy in segmenting the "Chest Xray Masks and Labels" dataset, outperforming both a fully convolutional network and a combined transformer-convolution model.
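Skip-connection splicing of this kind can be sketched as concatenating a saved encoder feature map with the upsampled decoder feature map before a fusing convolution; the channel sizes below are illustrative assumptions:

```python
import torch
import torch.nn as nn

encoder_feat = torch.randn(1, 64, 56, 56)    # saved during encoding
decoder_feat = torch.randn(1, 64, 28, 28)    # coarser decoder stage

up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
fuse = nn.Conv2d(128, 64, kernel_size=3, padding=1)

# Splice: upsample the decoder features, concatenate with the encoder features
# along the channel axis, then fuse back to the working channel width.
spliced = torch.cat([encoder_feat, up(decoder_feat)], dim=1)  # (1, 128, 56, 56)
out = fuse(spliced)                                           # (1, 64, 56, 56)
print(out.shape)
```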
Collapse
Affiliation(s)
- Yongping Dan
- School of Electronic and Information, Zhongyuan University of Technology, Zhengzhou, Henan, China
| | - Weishou Jin
- School of Electronic and Information, Zhongyuan University of Technology, Zhengzhou, Henan, China
| | - Zhida Wang
- School of Electronic and Information, Zhongyuan University of Technology, Zhengzhou, Henan, China
| | - Changhao Sun
- School of Electronic and Information, Zhongyuan University of Technology, Zhengzhou, Henan, China
| |
Collapse
|
17
|
Mukherjee P, Lee S, Elton DC, Nakada SY, Pickhardt PJ, Summers RM. Fully Automated Longitudinal Assessment of Renal Stone Burden on Serial CT Imaging Using Deep Learning. J Endourol 2023; 37:948-955. [PMID: 37310890 PMCID: PMC10387157 DOI: 10.1089/end.2023.0066] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/15/2023] Open
Abstract
Purpose: To use deep learning (DL) to automate the measurement and tracking of kidney stone burden over serial CT scans. Materials and Methods: This retrospective study included 259 scans from 113 symptomatic patients treated for urolithiasis at a single medical center between 2006 and 2019. These patients underwent a standard low-dose noncontrast CT scan followed by ultra-low-dose CT scans limited to the level of the kidneys. A DL model was used to detect, segment, and measure the volume of all stones in both initial and follow-up scans. Stone burden was characterized by the total volume of all stones in a scan (SV). The absolute and relative changes of SV over serial scans (SVA and SVR, respectively) were computed. The automated assessments were compared with manual assessments using the concordance correlation coefficient (CCC), and their agreement was visualized using Bland-Altman and scatter plots. Results: Two hundred twenty-eight of 233 scans with stones were identified by the automated pipeline; per-scan sensitivity was 97.8% (95% confidence interval [CI]: 96.0-99.7). The per-scan positive predictive value was 96.6% (95% CI: 94.4-98.8). The median SV, SVA, and SVR were 476.5 mm3, -10 mm3, and 0.89, respectively. After removing outliers outside the 5th and 95th percentiles, the CCCs measuring agreement on SV, SVA, and SVR were 0.995 (0.992-0.996), 0.980 (0.972-0.986), and 0.915 (0.881-0.939), respectively. Conclusions: The automated DL-based measurements showed good agreement with manual assessments of stone burden and its interval change on serial CT scans.
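The quantities reported here are simple to reproduce. The sketch below computes SVA and SVR from baseline and follow-up stone volumes and implements Lin's concordance correlation coefficient, the standard CCC for agreement between two measurement methods; the numeric data are hypothetical and only illustrate the calculations.

```python
import numpy as np

def stone_burden_change(sv_baseline: float, sv_followup: float) -> tuple[float, float]:
    """Absolute (SVA) and relative (SVR) change in total stone volume (mm^3)."""
    sva = sv_followup - sv_baseline
    svr = sv_followup / sv_baseline
    return sva, svr

def lin_ccc(x: np.ndarray, y: np.ndarray) -> float:
    """Lin's concordance correlation coefficient between two raters."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()                 # population variances
    cov = ((x - mx) * (y - my)).mean()        # population covariance
    return 2 * cov / (vx + vy + (mx - my) ** 2)

# Hypothetical example: automated vs. manual SV for five scans (mm^3).
auto = np.array([480.0, 512.0, 301.0, 760.0, 95.0])
manual = np.array([476.5, 520.0, 298.0, 755.0, 99.0])
print(stone_burden_change(sv_baseline=500.0, sv_followup=490.0))  # (-10.0, 0.98)
print(f"CCC = {lin_ccc(auto, manual):.3f}")
```

A CCC near 1 indicates both high correlation and low systematic offset between the automated and manual measurements, which is why it is preferred over plain Pearson correlation for method-agreement studies.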
Collapse
Affiliation(s)
- Pritam Mukherjee
- Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, Department of Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bethesda, Maryland, USA
| | - Sungwon Lee
- Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, Department of Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bethesda, Maryland, USA
| | - Daniel C. Elton
- Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, Department of Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bethesda, Maryland, USA
| | - Stephen Y. Nakada
- Department of Radiology, The University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA
| | - Perry J. Pickhardt
- Department of Radiology, The University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA
| | - Ronald M. Summers
- Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, Department of Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bethesda, Maryland, USA
| |
Collapse
|
18
|
Li H, Nan Y, Del Ser J, Yang G. Large-Kernel Attention for 3D Medical Image Segmentation. Cognit Comput 2023; 16:2063-2077. [PMID: 38974012 PMCID: PMC11226511 DOI: 10.1007/s12559-023-10126-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2022] [Accepted: 02/09/2023] [Indexed: 03/03/2023]
Abstract
Automated segmentation of multiple organs and tumors from 3D medical images such as magnetic resonance imaging (MRI) and computed tomography (CT) scans using deep learning can aid in diagnosing and treating cancer. However, organs often overlap and are complexly connected, and they exhibit extensive anatomical variation and low contrast. In addition, the diversity of tumor shape, location, and appearance, together with the dominance of background voxels, makes accurate 3D medical image segmentation difficult. This paper proposes a novel 3D large-kernel (LK) attention module that addresses these problems to achieve accurate multi-organ and tumor segmentation. The proposed LK attention module combines the advantages of biologically inspired self-attention and convolution, capturing local contextual information, long-range dependencies, and channel adaptation. It also decomposes the LK convolution to reduce computational cost and can be easily incorporated into CNNs such as U-Net. Comprehensive ablation experiments demonstrated the feasibility of the convolutional decomposition and identified the most efficient and effective network design. The best variant, a Mid-type 3D LK attention-based U-Net, was evaluated on the CT-ORG and BraTS 2020 datasets, achieving state-of-the-art segmentation performance compared with leading CNN- and Transformer-based methods for medical image segmentation. The performance improvement attributable to the proposed 3D LK attention module was statistically validated.
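As a rough illustration of the decomposition idea, the sketch below builds a 3D large-kernel attention layer in PyTorch by stacking a depth-wise convolution, a depth-wise dilated convolution, and a 1x1x1 point-wise convolution, in the style popularized by Visual Attention Networks. The specific kernel sizes and dilation are assumptions made here, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class LKAttention3D(nn.Module):
    """Illustrative 3D large-kernel attention via decomposed convolutions.

    A single large-kernel convolution is approximated by a depth-wise 5^3
    convolution followed by a depth-wise dilated 7^3 convolution (dilation 3)
    and a 1x1x1 point-wise convolution, giving a large effective receptive
    field at a fraction of the parameter and compute cost. The resulting map
    reweights the input both spatially and per channel.
    """
    def __init__(self, channels: int):
        super().__init__()
        self.dw = nn.Conv3d(channels, channels, 5, padding=2, groups=channels)
        self.dw_dilated = nn.Conv3d(channels, channels, 7, padding=9,
                                    dilation=3, groups=channels)
        self.pw = nn.Conv3d(channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = self.pw(self.dw_dilated(self.dw(x)))
        return x * attn  # voxel- and channel-adaptive reweighting

# Usage sketch: lka = LKAttention3D(32); y = lka(torch.randn(1, 32, 24, 96, 96))
```

The padding values keep the spatial dimensions unchanged, so a layer like this can be dropped between existing blocks of a 3D U-Net encoder or decoder without altering the surrounding architecture.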
Collapse
Affiliation(s)
- Hao Li
- National Heart and Lung Institute, Faculty of Medicine, Imperial College London, London, UK
- Department of Bioengineering, Faculty of Engineering, Imperial College London, London, UK
| | - Yang Nan
- National Heart and Lung Institute, Faculty of Medicine, Imperial College London, London, UK
| | - Javier Del Ser
- TECNALIA, Basque Research & Technology Alliance (BRTA), Derio, Spain
- University of the Basque Country (UPV/EHU), Bilbao, Spain
| | - Guang Yang
- National Heart and Lung Institute, Faculty of Medicine, Imperial College London, London, UK
- Royal Brompton Hospital, London, UK
| |
Collapse
|