Yazdani E, Karamzadeh-Ziarati N, Cheshmi SS, Sadeghi M, Geramifar P, Vosoughi H, Jahromi MK, Kheradpisheh SR. Automated segmentation of lesions and organs at risk on [
68Ga]Ga-PSMA-11 PET/CT images using self-supervised learning with
Swin UNETR.
Cancer Imaging 2024;
24:30. [PMID:
38424612 PMCID:
PMC10903052 DOI:
10.1186/s40644-024-00675-x]
[Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2023] [Accepted: 02/15/2024] [Indexed: 03/02/2024] Open
Abstract
BACKGROUND
Prostate-specific membrane antigen (PSMA) PET/CT imaging is widely used for quantitative image analysis, especially in radioligand therapy (RLT) for metastatic castration-resistant prostate cancer (mCRPC). Unknown features influencing PSMA biodistribution can be explored by analyzing segmented organs at risk (OAR) and lesions. Manual segmentation is time-consuming and labor-intensive, so automated segmentation methods are desirable. Training deep-learning segmentation models is challenging due to the scarcity of high-quality annotated images. Addressing this, we developed shifted windows UNEt TRansformers (Swin UNETR) for fully automated segmentation. Within a self-supervised framework, the model's encoder was pre-trained on unlabeled data. The entire model was fine-tuned, including its decoder, using labeled data.
METHODS
In this work, 752 whole-body [68Ga]Ga-PSMA-11 PET/CT images were collected from two centers. For self-supervised model pre-training, 652 unlabeled images were employed. The remaining 100 images were manually labeled for supervised training. In the supervised training phase, 5-fold cross-validation was used with 64 images for model training and 16 for validation, from one center. For testing, 20 hold-out images, evenly distributed between two centers, were used. Image segmentation and quantification metrics were evaluated on the test set compared to the ground-truth segmentation conducted by a nuclear medicine physician.
RESULTS
The model generates high-quality OARs and lesion segmentation in lesion-positive cases, including mCRPC. The results show that self-supervised pre-training significantly improved the average dice similarity coefficient (DSC) for all classes by about 3%. Compared to nnU-Net, a well-established model in medical image segmentation, our approach outperformed with a 5% higher DSC. This improvement was attributed to our model's combined use of self-supervised pre-training and supervised fine-tuning, specifically when applied to PET/CT input. Our best model had the lowest DSC for lesions at 0.68 and the highest for liver at 0.95.
CONCLUSIONS
We developed a state-of-the-art neural network using self-supervised pre-training on whole-body [68Ga]Ga-PSMA-11 PET/CT images, followed by fine-tuning on a limited set of annotated images. The model generates high-quality OARs and lesion segmentation for PSMA image analysis. The generalizable model holds potential for various clinical applications, including enhanced RLT and patient-specific internal dosimetry.
Collapse