1
Erdur AC, Rusche D, Scholz D, Kiechle J, Fischer S, Llorián-Salvador Ó, Buchner JA, Nguyen MQ, Etzel L, Weidner J, Metz MC, Wiestler B, Schnabel J, Rueckert D, Combs SE, Peeken JC. Deep learning for autosegmentation for radiotherapy treatment planning: State-of-the-art and novel perspectives. Strahlenther Onkol 2025;201:236-254. PMID: 39105745; PMCID: PMC11839850; DOI: 10.1007/s00066-024-02262-2.
Abstract
Artificial intelligence (AI) has developed rapidly, and many AI tools have already entered our daily lives. The medical field of radiation oncology is also subject to this development, with AI entering all steps of the patient journey. In this review article, we summarize contemporary AI techniques and explore the clinical applications of AI-based automated segmentation models in radiotherapy planning, focusing on delineation of organs at risk (OARs), the gross tumor volume (GTV), and the clinical target volume (CTV). Emphasizing the need for precise and individualized plans, we review various commercial and freeware segmentation tools as well as state-of-the-art research approaches. Through our own findings and based on the literature, we demonstrate improved efficiency and consistency as well as time savings in different clinical scenarios. Despite challenges in clinical implementation such as domain shifts, the potential benefits for personalized treatment planning are substantial. The integration of mathematical tumor growth models and AI-based tumor detection further enhances the possibilities for refining target volumes. As advancements continue, the prospect of one-stop-shop segmentation and radiotherapy planning represents an exciting frontier in radiotherapy, potentially enabling fast treatment with enhanced precision and individualization.
Affiliation(s)
- Ayhan Can Erdur
- Institute for Artificial Intelligence and Informatics in Medicine, Klinikum rechts der Isar, Technical University of Munich, Ismaninger Str., 81675, Munich, Bavaria, Germany.
- Department of Radiation Oncology, TUM School of Medicine and Health, Klinikum rechts der Isar, Technical University of Munich, Ismaninger Str., 81675, Munich, Bavaria, Germany.
- Daniel Rusche
- Department of Radiation Oncology, TUM School of Medicine and Health, Klinikum rechts der Isar, Technical University of Munich, Ismaninger Str., 81675, Munich, Bavaria, Germany
- Daniel Scholz
- Institute for Artificial Intelligence and Informatics in Medicine, Klinikum rechts der Isar, Technical University of Munich, Ismaninger Str., 81675, Munich, Bavaria, Germany
- Department of Neuroradiology, TUM School of Medicine and Health, Klinikum rechts der Isar, Technical University of Munich, Ismaninger Str., 81675, Munich, Bavaria, Germany
- Johannes Kiechle
- Department of Radiation Oncology, TUM School of Medicine and Health, Klinikum rechts der Isar, Technical University of Munich, Ismaninger Str., 81675, Munich, Bavaria, Germany
- Institute for Computational Imaging and AI in Medicine, Technical University of Munich, Lichtenberg Str. 2a, 85748, Garching, Bavaria, Germany
- Munich Center for Machine Learning (MCML), Technical University of Munich, Arcisstraße 21, 80333, Munich, Bavaria, Germany
- Konrad Zuse School of Excellence in Reliable AI (relAI), Technical University of Munich, Walther-von-Dyck-Straße 10, 85748, Garching, Bavaria, Germany
- Stefan Fischer
- Department of Radiation Oncology, TUM School of Medicine and Health, Klinikum rechts der Isar, Technical University of Munich, Ismaninger Str., 81675, Munich, Bavaria, Germany
- Institute for Computational Imaging and AI in Medicine, Technical University of Munich, Lichtenberg Str. 2a, 85748, Garching, Bavaria, Germany
- Munich Center for Machine Learning (MCML), Technical University of Munich, Arcisstraße 21, 80333, Munich, Bavaria, Germany
- Óscar Llorián-Salvador
- Department of Radiation Oncology, TUM School of Medicine and Health, Klinikum rechts der Isar, Technical University of Munich, Ismaninger Str., 81675, Munich, Bavaria, Germany
- Department for Bioinformatics and Computational Biology - i12, Technical University of Munich, Boltzmannstraße 3, 85748, Garching, Bavaria, Germany
- Institute of Organismic and Molecular Evolution, Johannes Gutenberg University Mainz (JGU), Hüsch-Weg 15, 55128, Mainz, Rhineland-Palatinate, Germany
- Josef A Buchner
- Department of Radiation Oncology, TUM School of Medicine and Health, Klinikum rechts der Isar, Technical University of Munich, Ismaninger Str., 81675, Munich, Bavaria, Germany
- Mai Q Nguyen
- Department of Radiation Oncology, TUM School of Medicine and Health, Klinikum rechts der Isar, Technical University of Munich, Ismaninger Str., 81675, Munich, Bavaria, Germany
- Lucas Etzel
- Department of Radiation Oncology, TUM School of Medicine and Health, Klinikum rechts der Isar, Technical University of Munich, Ismaninger Str., 81675, Munich, Bavaria, Germany
- Institute of Radiation Medicine (IRM), Helmholtz Zentrum, Ingolstädter Landstraße 1, 85764, Oberschleißheim, Bavaria, Germany
- Jonas Weidner
- Institute for Artificial Intelligence and Informatics in Medicine, Klinikum rechts der Isar, Technical University of Munich, Ismaninger Str., 81675, Munich, Bavaria, Germany
- Department of Neuroradiology, TUM School of Medicine and Health, Klinikum rechts der Isar, Technical University of Munich, Ismaninger Str., 81675, Munich, Bavaria, Germany
- Marie-Christin Metz
- Department of Neuroradiology, TUM School of Medicine and Health, Klinikum rechts der Isar, Technical University of Munich, Ismaninger Str., 81675, Munich, Bavaria, Germany
- Benedikt Wiestler
- Department of Neuroradiology, TUM School of Medicine and Health, Klinikum rechts der Isar, Technical University of Munich, Ismaninger Str., 81675, Munich, Bavaria, Germany
- Julia Schnabel
- Institute for Computational Imaging and AI in Medicine, Technical University of Munich, Lichtenberg Str. 2a, 85748, Garching, Bavaria, Germany
- Munich Center for Machine Learning (MCML), Technical University of Munich, Arcisstraße 21, 80333, Munich, Bavaria, Germany
- Konrad Zuse School of Excellence in Reliable AI (relAI), Technical University of Munich, Walther-von-Dyck-Straße 10, 85748, Garching, Bavaria, Germany
- Institute of Machine Learning in Biomedical Imaging, Helmholtz Munich, Ingolstädter Landstraße 1, 85764, Neuherberg, Bavaria, Germany
- School of Biomedical Engineering & Imaging Sciences, King's College London, Strand, WC2R 2LS, London, UK
- Daniel Rueckert
- Institute for Artificial Intelligence and Informatics in Medicine, Klinikum rechts der Isar, Technical University of Munich, Ismaninger Str., 81675, Munich, Bavaria, Germany
- Faculty of Engineering, Department of Computing, Imperial College London, Exhibition Rd, SW7 2BX, London, UK
- Stephanie E Combs
- Department of Radiation Oncology, TUM School of Medicine and Health, Klinikum rechts der Isar, Technical University of Munich, Ismaninger Str., 81675, Munich, Bavaria, Germany
- Institute of Radiation Medicine (IRM), Helmholtz Zentrum, Ingolstädter Landstraße 1, 85764, Oberschleißheim, Bavaria, Germany
- Partner Site Munich, German Consortium for Translational Cancer Research (DKTK), Munich, Bavaria, Germany
- Jan C Peeken
- Department of Radiation Oncology, TUM School of Medicine and Health, Klinikum rechts der Isar, Technical University of Munich, Ismaninger Str., 81675, Munich, Bavaria, Germany
- Institute of Radiation Medicine (IRM), Helmholtz Zentrum, Ingolstädter Landstraße 1, 85764, Oberschleißheim, Bavaria, Germany
- Partner Site Munich, German Consortium for Translational Cancer Research (DKTK), Munich, Bavaria, Germany
2
Arjmandi N, Mosleh‐Shirazi MA, Mohebbi S, Nasseri S, Mehdizadeh A, Pishevar Z, Hosseini S, Tehranizadeh AA, Momennezhad M. Evaluating the dosimetric impact of deep-learning-based auto-segmentation in prostate cancer radiotherapy: Insights into real-world clinical implementation and inter-observer variability. J Appl Clin Med Phys 2025;26:e14569. PMID: 39616629; PMCID: PMC11905246; DOI: 10.1002/acm2.14569.
Abstract
PURPOSE This study aimed to investigate the dosimetric impact of deep-learning-based auto-contouring for clinical target volume (CTV) and organs at risk (OARs) delineation in prostate cancer radiotherapy planning. Additionally, we compared the geometric accuracy of the auto-contouring system to the variability observed between human experts. METHODS We evaluated 28 planning CT volumes, each with three contour sets: reference original contours (OC), auto-segmented contours (AC), and expert-defined manual contours (EC). We generated 3D-CRT and intensity-modulated radiation therapy (IMRT) plans for each contour set and compared their dosimetric characteristics using dose-volume histograms (DVHs), homogeneity index (HI), conformity index (CI), and gamma pass rate (3%/3 mm). RESULTS The geometric differences between automated contours and both their original manual reference contours and a second set of manually generated contours were smaller than the differences between the two manually contoured sets for the bladder, right femoral head (RFH), and left femoral head (LFH) structures. Furthermore, dose distributions using planning target volumes (PTVs) derived from automatically contoured CTVs and auto-contoured OARs were consistent with plans based on reference contours across all evaluated cases for both 3D-CRT and IMRT plans. For example, in IMRT plans, the average D95 for PTVs was 77.71 ± 0.53 Gy for EC plans, 77.58 ± 0.69 Gy for OC plans, and 77.62 ± 0.38 Gy for AC plans. Automated contouring significantly reduced contouring time, averaging 0.53 ± 0.08 min compared to 24.9 ± 4.5 min for manual delineation. CONCLUSION Our automated contouring system can reduce inter-expert variability and achieve dosimetric accuracy comparable to gold standard reference contours, highlighting its potential for streamlining clinical workflows.
The quantitative analysis revealed no consistent trend of increase or decrease in doses to PTVs derived from automatically contoured CTVs or to OARs, indicating minimal impact on treatment outcomes. These findings support the clinical feasibility of utilizing our deep-learning-based auto-contouring model for prostate cancer radiotherapy planning.
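For reference, the dosimetric indices named in this abstract (HI, CI) have several competing definitions in the literature; the minimal sketch below uses the ICRU 83 homogeneity index and an RTOG-style conformity index, which may differ from the exact variants used in this study:

```python
import numpy as np

def homogeneity_index(target_doses):
    """ICRU 83 homogeneity index: (D2% - D98%) / D50%.
    target_doses: 1-D array of dose samples (Gy) inside the PTV;
    D2% (dose to the hottest 2% of volume) is the 98th dose percentile."""
    d2, d50, d98 = np.percentile(target_doses, [98, 50, 2])
    return (d2 - d98) / d50

def conformity_index(target_mask, prescription_isodose_mask):
    """RTOG-style conformity index: prescription isodose volume divided
    by target volume (1.0 would be perfectly conformal coverage)."""
    return prescription_isodose_mask.sum() / target_mask.sum()

# Toy example: a tightly controlled dose distribution inside a PTV,
# loosely modeled on the D95 values reported above (~77.6 Gy).
rng = np.random.default_rng(0)
doses = rng.normal(loc=77.6, scale=0.4, size=10_000)
hi = homogeneity_index(doses)  # close to 0 for a homogeneous plan
```

A lower HI indicates a more homogeneous target dose; the masks passed to `conformity_index` would be boolean voxel arrays from the planning system.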
Affiliation(s)
- Najmeh Arjmandi
- Department of Medical Physics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Mohammad Amin Mosleh‐Shirazi
- Physics Unit, Department of Radio‐Oncology, Shiraz University of Medical Sciences, Shiraz, Iran
- Ionizing and Non‐Ionizing Radiation Protection Research Center, School of Paramedical Sciences, Shiraz University of Medical Sciences, Shiraz, Iran
- Shahrokh Nasseri
- Department of Medical Physics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Medical Physics Research Center, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Alireza Mehdizadeh
- Ionizing and Non‐Ionizing Radiation Protection Research Center, School of Paramedical Sciences, Shiraz University of Medical Sciences, Shiraz, Iran
- Zohreh Pishevar
- Department of Radiation Oncology, Mashhad University of Medical Sciences, Mashhad, Iran
- Sare Hosseini
- Department of Radiation Oncology, Mashhad University of Medical Sciences, Mashhad, Iran
- Cancer Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Amin Amiri Tehranizadeh
- Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Mehdi Momennezhad
- Department of Medical Physics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Medical Physics Research Center, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
3
Wu Z, Wang D, Xu C, Peng S, Deng L, Liu M, Wu Y. Clinical target volume (CTV) automatic delineation using deep learning network for cervical cancer radiotherapy: A study with external validation. J Appl Clin Med Phys 2025;26:e14553. PMID: 39401180; PMCID: PMC11712972; DOI: 10.1002/acm2.14553.
Abstract
PURPOSE To explore the accuracy and feasibility of a proposed deep learning (DL) algorithm for clinical target volume (CTV) delineation in cervical cancer radiotherapy and to evaluate whether it generalizes to external cervical cancer and endometrial cancer cases. METHODS A total of 332 patients were enrolled in this study. We propose a state-of-the-art network called ResCANet, which is based on ResNet-UNet and adds cascade multi-scale convolutions in the skip connections to eliminate semantic differences between different feature layers. Atrous spatial pyramid pooling in the deepest feature layer combines the semantic information of different receptive fields without losing information. A total of 236 cervical cancer cases were randomly grouped into 5-fold cross-training (n = 189) and validation (n = 47) cohorts. External validations were performed in a separate cohort of 54 cervical cancer and 42 endometrial cancer cases. The performance of the proposed network was evaluated by dice similarity coefficient (DSC), sensitivity (SEN), positive predictive value (PPV), 95% Hausdorff distance (95HD), and oncologist clinical scores, comparing automatic with manual delineations in the validation cohorts. RESULTS In the internal validation cohort, the mean DSC, SEN, PPV, and 95HD for ResCANet reached 74.8%, 81.5%, 73.5%, and 10.5 mm. In the external independent validation cohorts, ResCANet reached 73.4%, 72.9%, 75.3%, and 12.5 mm for cervical cancer cases and 77.1%, 81.1%, 75.5%, and 10.3 mm for endometrial cancer cases, respectively. The clinical assessment scores showed that minor and no revisions (delineation time shortened to within 30 min) accounted for about 85% of all cases in DL-aided automatic delineation. CONCLUSIONS We demonstrated the problem of model generalizability for DL-based automatic delineation. The proposed network can improve the performance of automatic delineation for cervical cancer and shorten manual delineation time at no expense to quality. The network showed excellent clinical viability and can also be generalized to endometrial cancer with excellent performance.
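The geometric metrics used throughout these studies (DSC, SEN, PPV, 95HD) can be computed from binary segmentation masks and surface point sets; a minimal NumPy-only sketch with illustrative helper names, not the authors' code:

```python
import numpy as np

def overlap_metrics(pred, ref):
    """Dice similarity coefficient, sensitivity, and positive predictive
    value for two boolean masks of the same shape."""
    pred, ref = pred.astype(bool), ref.astype(bool)
    tp = np.logical_and(pred, ref).sum()
    dsc = 2.0 * tp / (pred.sum() + ref.sum())
    sen = tp / ref.sum()    # fraction of the reference volume covered
    ppv = tp / pred.sum()   # fraction of the prediction that is correct
    return dsc, sen, ppv

def hd95(points_a, points_b):
    """95th-percentile symmetric Hausdorff distance between two point sets
    (e.g. contour surface voxel coordinates in mm); robust to small
    outliers, unlike the maximum Hausdorff distance."""
    # Pairwise Euclidean distances via broadcasting, shape (len_a, len_b).
    d = np.linalg.norm(points_a[:, None, :] - points_b[None, :, :], axis=-1)
    return max(np.percentile(d.min(axis=1), 95),
               np.percentile(d.min(axis=0), 95))
```

In practice the point sets would be extracted from the contour surfaces and scaled by the voxel spacing before calling `hd95`.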
Affiliation(s)
- Zhe Wu
- Department of Digital Medicine, School of Biomedical Engineering and Medical Imaging, Army Medical University (Third Military Medical University), Chongqing, China
- Yu‐Yue Pathology Research Center, Jinfeng Laboratory, Chongqing, China
- Dong Wang
- Department of Radiation Oncology, Zigong First People's Hospital, Sichuan, China
- Cheng Xu
- Department of Radiotherapy, Beijing Luhe Hospital, Beijing, China
- Shengxian Peng
- Department of Radiation Oncology, Zigong First People's Hospital, Sichuan, China
- Lihua Deng
- Department of Digital Medicine, School of Biomedical Engineering and Medical Imaging, Army Medical University (Third Military Medical University), Chongqing, China
- Mujun Liu
- Department of Digital Medicine, School of Biomedical Engineering and Medical Imaging, Army Medical University (Third Military Medical University), Chongqing, China
- Yi Wu
- Department of Digital Medicine, School of Biomedical Engineering and Medical Imaging, Army Medical University (Third Military Medical University), Chongqing, China
- Yu‐Yue Pathology Research Center, Jinfeng Laboratory, Chongqing, China
4
Duan J, Tegtmeier RC, Vargas CE, Yu NY, Laughlin BS, Rwigema JCM, Anderson JD, Zhu L, Chen Q, Rong Y. Achieving accurate prostate auto-segmentation on CT in the absence of MR imaging. Radiother Oncol 2025;202:110588. PMID: 39419353; DOI: 10.1016/j.radonc.2024.110588.
Abstract
BACKGROUND Magnetic resonance imaging (MRI) is considered the gold standard for prostate segmentation. Computed tomography (CT)-based segmentation is prone to observer bias, potentially overestimating the prostate volume by ∼ 30 % compared to MRI. However, MRI accessibility is challenging for patients with contraindications or in rural areas globally with limited clinical resources. PURPOSE This study investigates the possibility of achieving MRI-level prostate auto-segmentation accuracy using CT-only input via a deep learning (DL) model trained with CT-MRI registered segmentation. METHODS AND MATERIALS A cohort of 111 definitive prostate radiotherapy patients with both CT and MRI images was retrospectively grouped into training (n = 37) and validation (n = 20) (where reference contours were derived from CT-MRI registration), and testing (n = 54) sets. Two commercial DL models were benchmarked against the reference contours in the training and validation sets. A custom DL model was incrementally retrained using the training dataset, quantitatively evaluated on the validation dataset, and qualitatively assessed by two different physician groups on the validation and testing datasets. A contour quality assurance (QA) model, established from the proposed model on the validation dataset, was applied to the test group to identify potential errors, confirmed by human visual inspection. RESULTS Two commercial models exhibited large deviations in the prostate apex with CT-only input (median: 0.77/0.78 for Dice similarity coefficient (DSC), and 0.80 cm/0.83 cm for 95 % directed Hausdorff Distance (HD95), respectively). The proposed model demonstrated superior geometric similarity compared to commercial models, particularly in the apex region, with improvements of 0.05/0.17 cm and 0.06/0.25 cm in median DSC/HD95, respectively. Physician evaluation on MRI-CT registration data rated 69 %-78 % of the proposed model's contours as clinically acceptable without modifications. 
Additionally, 73 % of cases flagged by the contour quality assurance (QA) model were confirmed via visual inspection. CONCLUSIONS The proposed incremental learning strategy based on CT-MRI registration information enhances prostate segmentation accuracy when MRI availability is limited clinically.
Affiliation(s)
- Jingwei Duan
- Mayo Clinic Arizona, Phoenix, AZ, United States; The University of Alabama at Birmingham, Birmingham, AL, United States
- Riley C Tegtmeier
- Mayo Clinic Arizona, Phoenix, AZ, United States; The University of South Florida, Tampa, FL, United States
- Nathan Y Yu
- Mayo Clinic Arizona, Phoenix, AZ, United States
- Libing Zhu
- Mayo Clinic Arizona, Phoenix, AZ, United States
- Quan Chen
- Mayo Clinic Arizona, Phoenix, AZ, United States
- Yi Rong
- Mayo Clinic Arizona, Phoenix, AZ, United States
5
Zhang Y, Amjad A, Ding J, Sarosiek C, Zarenia M, Conlin R, Hall WA, Erickson B, Paulson E. Comprehensive Clinical Usability-Oriented Contour Quality Evaluation for Deep Learning Auto-segmentation: Combining Multiple Quantitative Metrics Through Machine Learning. Pract Radiat Oncol 2025;15:93-102. PMID: 39233005; PMCID: PMC11711007; DOI: 10.1016/j.prro.2024.07.007.
Abstract
PURPOSE The metrics commonly used for evaluating the quality of auto-segmented contours have limitations and do not always reflect the clinical usefulness of the contours. This work aims to develop a novel contour quality classification (CQC) method that combines multiple quantitative metrics for clinical usability-oriented contour quality evaluation for deep learning-based auto-segmentation (DLAS). METHODS AND MATERIALS The CQC was designed to categorize contours on slices as acceptable, minor edit, or major edit based on the expected editing effort/time, using supervised ensemble tree classification models with 7 quantitative metrics. Organ-specific models were trained for 5 abdominal organs (pancreas, duodenum, stomach, small bowel, and large bowel) using 50 magnetic resonance imaging (MRI) data sets. Twenty additional MRI and 9 computed tomography (CT) data sets were employed for testing. Interobserver variation (IOV) was assessed among 6 observers, and consensus labels were established through majority vote for evaluation. The CQC was also compared with a threshold-based baseline approach. RESULTS For the 5 organs, the average area under the curve was 0.982 ± 0.01 and 0.979 ± 0.01, the mean accuracy was 95.8% ± 1.7% and 94.3% ± 2.1%, and the mean risk rate was 0.8% ± 0.4% and 0.7% ± 0.5% for the MRI and CT testing data sets, respectively. The CQC results closely matched the IOV results (mean accuracy of 94.2% ± 0.8% and 94.8% ± 1.7%) and were significantly higher than those obtained using the threshold-based method (mean accuracy of 80.0% ± 4.7%, 83.8% ± 5.2%, and 77.3% ± 6.6% using 1, 2, and 3 metrics, respectively). CONCLUSIONS The CQC models demonstrated high performance in classifying the quality of contour slices. This method can address the limitations of existing metrics and offers an intuitive and comprehensive solution for clinically oriented evaluation and comparison of DLAS systems.
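The core idea of the CQC method, feeding several per-slice quantitative metrics into a supervised ensemble tree classifier with acceptable / minor edit / major edit labels, can be sketched as follows; the feature layout, toy labels, and scikit-learn model choice are illustrative only, not the authors' implementation:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)

# Toy training data: one row per contour slice, one column per quantitative
# metric (the paper uses 7, e.g. overlap and surface-distance measures).
n_slices, n_metrics = 300, 7
X = rng.random((n_slices, n_metrics))

# Illustrative labels by expected editing effort:
# 0 = acceptable, 1 = minor edit, 2 = major edit.
# Here derived from a toy rule on the first metric purely for demonstration.
y = np.digitize(X[:, 0], bins=[0.4, 0.8])

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)
pred = clf.predict(X)
train_acc = (pred == y).mean()
```

In a real setting `X` would hold the computed metrics for each slice and `y` the editing-effort labels assigned by observers, with evaluation on held-out data rather than the training set.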
Affiliation(s)
- Ying Zhang
- Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, Wisconsin; Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, Texas.
- Asma Amjad
- Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, Wisconsin
- Jie Ding
- Department of Radiation Oncology, Emory University School of Medicine, Atlanta, Georgia
- Christina Sarosiek
- Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, Wisconsin
- Mohammad Zarenia
- Department of Radiation Medicine, MedStar Georgetown University Hospital, Washington, District of Columbia
- Renae Conlin
- Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, Wisconsin
- William A Hall
- Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, Wisconsin
- Beth Erickson
- Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, Wisconsin
- Eric Paulson
- Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, Wisconsin
6
Zarenia M, Zhang Y, Sarosiek C, Conlin R, Amjad A, Paulson E. Deep learning-based automatic contour quality assurance for auto-segmented abdominal MR-Linac contours. Phys Med Biol 2024;69. PMID: 39413822; PMCID: PMC11551967; DOI: 10.1088/1361-6560/ad87a6.
Abstract
Objective. Deep-learning auto-segmentation (DLAS) aims to streamline contouring in clinical settings. Nevertheless, achieving clinical acceptance of DLAS remains a hurdle in abdominal MRI, hindering the implementation of efficient clinical workflows for MR-guided online adaptive radiotherapy (MRgOART). Integrating automated contour quality assurance (ACQA) with automatic contour correction (ACC) techniques could optimize the performance of ACC by concentrating on inaccurate contours. Furthermore, ACQA can facilitate the contour selection process from various DLAS tools and/or deformable contour propagation from a prior treatment session. Here, we present the performance of novel DL-based 3D ACQA models for evaluating DLAS contours acquired during MRgOART. Approach. The ACQA model, based on a 3D convolutional neural network (CNN), was trained using pancreas and duodenum contours obtained from a research DLAS tool on abdominal MRIs acquired from a 1.5 T MR-Linac. The training dataset contained abdominal MR images, DL contours, and their corresponding quality ratings, from 103 datasets. The quality of DLAS contours was determined using an in-house contour classification tool, which categorizes contours as acceptable or edit-required based on the expected editing effort. The performance of the 3D ACQA model was evaluated using an independent dataset of 34 abdominal MRIs, utilizing confusion matrices for true and predicted classes. Main results. The ACQA predicted 'acceptable' and 'edit-required' contours at 72.2% (91/126) and 83.6% (726/868) accuracy for pancreas, and at 71.2% (79/111) and 89.6% (772/862) for duodenum contours, respectively. The model successfully identified false positive (extra) and false negative (missing) DLAS contours at 93.75% (15/16) and 99.7% (438/439) accuracy for pancreas, and at 95% (57/60) and 98.9% (91/99) for duodenum, respectively. Significance. We developed 3D ACQA models capable of quickly evaluating the quality of DLAS pancreas and duodenum contours on abdominal MRI. These models can be integrated into the clinical workflow, facilitating an efficient and consistent contour evaluation process in MRgOART for abdominal malignancies.
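The per-class accuracies quoted above follow directly from the reported confusion-matrix counts; as a quick check for the pancreas model:

```python
# Pancreas ACQA confusion-matrix counts reported above.
acceptable_correct, acceptable_total = 91, 126   # 'acceptable' class
edit_correct, edit_total = 726, 868              # 'edit-required' class

acc_acceptable = acceptable_correct / acceptable_total  # ~0.722, i.e. 72.2%
acc_edit = edit_correct / edit_total                    # ~0.836, i.e. 83.6%

# Micro-averaged accuracy over both classes combined.
overall = (acceptable_correct + edit_correct) / (acceptable_total + edit_total)
```

The micro-averaged figure is dominated by the much larger 'edit-required' class, which is worth keeping in mind when comparing ACQA models.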
Affiliation(s)
- Mohammad Zarenia
- Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, WI
- Department of Radiation Medicine, MedStar Georgetown University Hospital, Washington, D.C
- Ying Zhang
- Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, WI
- Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, TX
- Christina Sarosiek
- Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, WI
- Renae Conlin
- Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, WI
- Asma Amjad
- Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, WI
- Eric Paulson
- Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, WI
7
Rong Y, Chen Q, Fu Y, Yang X, Al-Hallaq HA, Wu QJ, Yuan L, Xiao Y, Cai B, Latifi K, Benedict SH, Buchsbaum JC, Qi XS. NRG Oncology Assessment of Artificial Intelligence Deep Learning-Based Auto-segmentation for Radiation Therapy: Current Developments, Clinical Considerations, and Future Directions. Int J Radiat Oncol Biol Phys 2024;119:261-280. PMID: 37972715; PMCID: PMC11023777; DOI: 10.1016/j.ijrobp.2023.10.033.
Abstract
Deep learning neural networks (DLNN) in artificial intelligence (AI) have been extensively explored for automatic segmentation in radiotherapy (RT). In contrast to traditional model-based methods, data-driven AI-based models for auto-segmentation have shown high accuracy in early studies in research settings and controlled environments (single institution). Vendor-provided commercial AI models are made available as part of the integrated treatment planning system (TPS) or as stand-alone tools that provide a streamlined workflow interacting with the main TPS. These commercial tools have drawn clinics' attention thanks to their significant benefit in reducing the workload of manual contouring and shortening the duration of treatment planning. However, challenges occur when applying these commercial AI-based segmentation models to diverse clinical scenarios, particularly in uncontrolled environments. Standardization of contouring nomenclature and guidelines has been a main task undertaken by NRG Oncology. In clinical trials, AI auto-segmentation holds the potential to reduce interobserver variation, nomenclature non-compliance, and contouring guideline deviations among participants, while trial reviewers could use AI tools to verify the contour accuracy and compliance of submitted datasets. In recognition of the growing clinical utilization and potential of these tools, NRG Oncology has formed a working group to assess in-house and commercially available AI models, evaluation metrics, clinical challenges, and limitations, as well as future developments in addressing these challenges. General recommendations are made for the implementation of these commercial AI models, along with precautions regarding their challenges and limitations.
Affiliation(s)
- Yi Rong
- Mayo Clinic Arizona, Phoenix, AZ
- Quan Chen
- City of Hope Comprehensive Cancer Center, Duarte, CA
- Yabo Fu
- Memorial Sloan Kettering Cancer Center, Commack, NY
- Lulin Yuan
- Virginia Commonwealth University, Richmond, VA
- Ying Xiao
- University of Pennsylvania/Abramson Cancer Center, Philadelphia, PA
- Bin Cai
- The University of Texas Southwestern Medical Center, Dallas, TX
- Stanley H Benedict
- University of California Davis Comprehensive Cancer Center, Sacramento, CA
- X Sharon Qi
- University of California Los Angeles, Los Angeles, CA
8
Luan S, Ou-Yang J, Yang X, Wei W, Xue X, Zhu B. A multi-modal vision-language pipeline strategy for contour quality assurance and adaptive optimization. Phys Med Biol 2024;69:065005. PMID: 38373347; DOI: 10.1088/1361-6560/ad2a97.
Abstract
Objective. Accurate delineation of organs-at-risk (OARs) is a critical step in radiotherapy. Deep-learning-generated segmentations usually need to be reviewed and corrected manually by oncologists, which is time-consuming and operator-dependent. Therefore, an automated quality assurance (QA) and adaptive optimization correction strategy was proposed to identify and optimize 'incorrect' auto-segmentations. Approach. A total of 586 CT images and labels from nine institutions were used. The OARs included the brainstem, parotid, and mandible. The deep-learning-generated contours were compared with the manual ground-truth delineations. In this study, we proposed a novel contour quality assurance and adaptive optimization (CQA-AO) strategy, which consists of the following three main components: (1) the contour QA module classified the deep-learning-generated contours as either accepted or unaccepted; (2) the unacceptable-contour category analysis module provided the potential error reasons (five unacceptable categories) and locations (attention heatmaps); (3) the adaptive correction module integrated vision-language representations and utilized convex optimization algorithms to achieve adaptive correction of 'incorrect' contours. Main results. In the contour QA tasks, the sensitivity (accuracy, precision) of the CQA-AO strategy reached 0.940 (0.945, 0.948), 0.962 (0.937, 0.913), and 0.967 (0.962, 0.957) for the brainstem, parotid, and mandible, respectively. In the unacceptable-contour category analysis, the (FI, AccI, Fmicro, Fmacro) of the CQA-AO strategy reached (0.901, 0.763, 0.862, 0.822), (0.855, 0.737, 0.837, 0.784), and (0.907, 0.762, 0.858, 0.821) for the brainstem, parotid, and mandible, respectively. After adaptive optimization correction, the DSC values of the brainstem, parotid, and mandible improved by 9.4%, 25.9%, and 13.5%, and the Hausdorff distance values decreased by 62%, 70.6%, and 81.6%, respectively. Significance.
The proposed CQA-AO strategy, which combines contour QA with adaptive optimization correction for OAR contouring, demonstrated superior performance compared to conventional methods. This method can be implemented in clinical contouring procedures and improve the efficiency of the delineation and review workflow.
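The accept/reject QA figures reported above (sensitivity, accuracy, precision) follow directly from a binary confusion matrix. A minimal sketch of that computation, not taken from the paper (function name and example labels are illustrative):

```python
# Evaluating a binary contour-QA classifier with the sensitivity /
# accuracy / precision metrics the abstract reports.
# Convention here: 1 = unacceptable contour, 0 = acceptable.

def qa_metrics(y_true, y_pred):
    """Return (sensitivity, accuracy, precision) for binary QA labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    return sensitivity, accuracy, precision

# Hypothetical review of six contours: three truly unacceptable,
# the classifier flags three of the six.
truth = [1, 1, 0, 0, 1, 0]
preds = [1, 1, 1, 0, 0, 0]
sens, acc, prec = qa_metrics(truth, preds)
```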
Affiliation(s)
- Shunyao Luan
- School of Integrated Circuits, Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, People's Republic of China
- Jun Ou-Yang
- School of Integrated Circuits, Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, People's Republic of China
- Xiaofei Yang
- School of Integrated Circuits, Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, People's Republic of China
- Wei Wei
- Department of Radiation Oncology, Hubei Cancer Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, People's Republic of China
- Xudong Xue
- Department of Radiation Oncology, Hubei Cancer Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, People's Republic of China
- Benpeng Zhu
- School of Integrated Circuits, Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, People's Republic of China
9
Podobnik G, Ibragimov B, Peterlin P, Strojan P, Vrtovec T. vOARiability: Interobserver and intermodality variability analysis in OAR contouring from head and neck CT and MR images. Med Phys 2024; 51:2175-2186. [PMID: 38230752 DOI: 10.1002/mp.16924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2023] [Revised: 10/31/2023] [Accepted: 12/13/2023] [Indexed: 01/18/2024] Open
Abstract
BACKGROUND Accurate and consistent contouring of organs-at-risk (OARs) from medical images is a key step of radiotherapy (RT) cancer treatment planning. Most contouring approaches rely on computed tomography (CT) images, but the integration of the complementary magnetic resonance (MR) modality is highly recommended, especially from the perspective of OAR contouring, synthetic CT and MR image generation for MR-only RT, and MR-guided RT. Although MR has been recognized as valuable for contouring OARs in the head and neck (HaN) region, the accuracy and consistency of the resulting contours have not yet been objectively evaluated. PURPOSE To analyze the interobserver and intermodality variability in contouring OARs in the HaN region, performed by observers with different levels of experience from CT and MR images of the same patients. METHODS In the final cohort of 27 CT and MR images of the same patients, contours of up to 31 OARs were obtained by a radiation oncology resident (junior observer, JO) and a board-certified radiation oncologist (senior observer, SO). The resulting contours were then evaluated in terms of interobserver variability, characterized as the agreement among different observers (JO and SO) when contouring OARs in a selected modality (CT or MR), and intermodality variability, characterized as the agreement among different modalities (CT and MR) when OARs were contoured by a selected observer (JO or SO), both by the Dice coefficient (DC) and the 95th-percentile Hausdorff distance (HD95). RESULTS The mean (± standard deviation) interobserver variability was 69.0 ± 20.2% and 5.1 ± 4.1 mm, while the mean intermodality variability was 61.6 ± 19.0% and 6.1 ± 4.3 mm in terms of DC and HD95, respectively, across all OARs. Statistically significant differences were only found for specific OARs.
The performed MR-to-CT image registration resulted in a mean target registration error of 1.7 ± 0.5 mm, which was considered valid for the analysis of intermodality variability. CONCLUSIONS The contouring variability was, in general, similar for both image modalities, and experience did not considerably affect contouring performance. However, the results indicate that an OAR that is difficult to contour is difficult regardless of whether it is contoured in the CT or MR image, and that observer experience may be an important factor for OARs that are deemed difficult to contour. Several of the differences in the resulting variability can also be attributed to adherence to guidelines, especially for OARs with poor visibility or without distinctive boundaries in either CT or MR images. Although considerable contouring differences were observed for specific OARs, it can be concluded that almost all OARs can be contoured with a similar degree of variability in either the CT or MR modality, which works in favor of MR images from the perspective of MR-only and MR-guided RT.
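The two agreement metrics used throughout this study, DC and HD95, can be computed directly from binary segmentation masks. A hedged sketch of a common implementation (surface extraction via erosion and distance transforms are assumptions, not the authors' code):

```python
# Dice coefficient (DC) and 95th-percentile Hausdorff distance (HD95)
# between two boolean segmentation masks.
import numpy as np
from scipy import ndimage

def dice(a, b):
    """Dice coefficient between two boolean masks."""
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

def hd95(a, b, spacing=(1.0, 1.0, 1.0)):
    """Symmetric 95th-percentile Hausdorff distance between the
    surfaces of two boolean masks, in the units of `spacing`."""
    def surface(mask):
        # Surface = mask voxels removed by one erosion step
        return mask & ~ndimage.binary_erosion(mask)
    sa, sb = surface(a), surface(b)
    # Distance from each surface voxel to the other mask's surface
    da = ndimage.distance_transform_edt(~sb, sampling=spacing)[sa]
    db = ndimage.distance_transform_edt(~sa, sampling=spacing)[sb]
    return float(np.percentile(np.concatenate([da, db]), 95))

# Toy example: two 10x10x10 boxes, shifted by one voxel along one axis
a = np.zeros((20, 20, 20), bool); a[5:15, 5:15, 5:15] = True
b = np.zeros((20, 20, 20), bool); b[6:16, 5:15, 5:15] = True
d = dice(a, b)
h = hd95(a, b)
```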
Affiliation(s)
- Gašper Podobnik
- Faculty of Electrical Engineering, University of Ljubljana, Ljubljana, Slovenia
- Bulat Ibragimov
- Faculty of Electrical Engineering, University of Ljubljana, Ljubljana, Slovenia
- Department of Computer Science, University of Copenhagen, Copenhagen, Denmark
- Tomaž Vrtovec
- Faculty of Electrical Engineering, University of Ljubljana, Ljubljana, Slovenia
10
Brooks J, Tryggestad E, Anand A, Beltran C, Foote R, Lucido JJ, Laack NN, Routman D, Patel SH, Seetamsetty S, Moseley D. Knowledge-based quality assurance of a comprehensive set of organ at risk contours for head and neck radiotherapy. Front Oncol 2024; 14:1295251. [PMID: 38487718 PMCID: PMC10937434 DOI: 10.3389/fonc.2024.1295251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2023] [Accepted: 02/05/2024] [Indexed: 03/17/2024] Open
Abstract
Introduction Manual review of organ at risk (OAR) contours is crucial for creating safe radiotherapy plans but can be time-consuming and error-prone. Statistical and deep learning models show the potential to automatically detect improper contours by identifying outliers using large sets of acceptable data (knowledge-based outlier detection) and may be able to assist human reviewers during review of OAR contours. Methods This study developed an automated knowledge-based outlier detection method and assessed its ability to detect erroneous contours for all common head and neck (HN) OAR types used clinically at our institution. We utilized 490 accurate CT-based HN structure sets from unique patients, each with forty-two HN OAR contours when anatomically present. The structure sets were distributed as 80% for training, 10% for validation, and 10% for testing. In addition, 190 and 37 simulated contours containing errors were added to the validation and test sets, respectively. Single-contour features, including location, shape, orientation, volume, and CT number, were used to train three single-contour feature models (z-score, Mahalanobis distance [MD], and autoencoder [AE]). Additionally, a novel contour-to-contour relationship (CCR) model was trained using the minimum distance and volumetric overlap between pairs of OAR contours to quantify overlap and separation. Inferences from the single-contour feature models were combined with the CCR model inferences and with inferences evaluating the number of disconnected parts in a single contour, and then compared. Results In the test dataset, before combination with the CCR model, the area under the curve values were 0.922/0.939/0.939 for the z-score, MD, and AE models, respectively, across all contours. After combination with CCR model inferences, the z-score, MD, and AE models had sensitivities of 0.838/0.892/0.865, specificities of 0.922/0.907/0.887, and balanced accuracies (BA) of 0.880/0.900/0.876, respectively.
In the validation dataset, with similar overall performance and no signs of overfitting, model performance for individual OAR types was assessed. The combined AE model demonstrated minimum, median, and maximum BAs of 0.729, 0.908, and 0.980 across OAR types. Discussion Our novel knowledge-based method combines models utilizing single-contour and CCR features to effectively detect erroneous OAR contours across a comprehensive set of 42 clinically used OAR types for HN radiotherapy.
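Two of the single-contour feature models named above, z-score and Mahalanobis distance, can be sketched in a few lines: both are fit on features of known-acceptable contours and score new contours by how far they fall from that distribution. The feature names and thresholds below are illustrative assumptions, not the authors' implementation:

```python
# Knowledge-based outlier scoring of single-contour features:
# per-feature z-score vs. Mahalanobis distance over the joint vector.
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical training features for acceptable contours of one OAR type:
# [volume_cc, centroid_z_mm, mean_HU]
train = rng.normal([30.0, 100.0, 40.0], [3.0, 5.0, 4.0], size=(200, 3))
mu, sigma = train.mean(axis=0), train.std(axis=0)
cov_inv = np.linalg.inv(np.cov(train, rowvar=False))

def z_score(x):
    """Largest absolute per-feature z-score; flags any extreme feature."""
    return float(np.max(np.abs((x - mu) / sigma)))

def mahalanobis(x):
    """Distance of a feature vector to the training distribution,
    accounting for feature correlations."""
    d = x - mu
    return float(np.sqrt(d @ cov_inv @ d))

typical = np.array([31.0, 101.0, 39.0])
erroneous = np.array([55.0, 60.0, 40.0])  # implausible volume and location
z_t, z_e = z_score(typical), z_score(erroneous)
m_t, m_e = mahalanobis(typical), mahalanobis(erroneous)
```

A contour would then be flagged when its score exceeds a threshold chosen on validation data (e.g. a z-score above 3).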
Affiliation(s)
- Jamison Brooks
- Department of Radiation Oncology, Mayo Clinic Rochester, Rochester, MN, United States
- Erik Tryggestad
- Department of Radiation Oncology, Mayo Clinic Rochester, Rochester, MN, United States
- Aman Anand
- Department of Radiation Oncology, Mayo Clinic Arizona, Phoenix, AZ, United States
- Chris Beltran
- Department of Radiation Oncology, Mayo Clinic Florida, Jacksonville, FL, United States
- Robert Foote
- Department of Radiation Oncology, Mayo Clinic Rochester, Rochester, MN, United States
- J. John Lucido
- Department of Radiation Oncology, Mayo Clinic Rochester, Rochester, MN, United States
- Nadia N. Laack
- Department of Radiation Oncology, Mayo Clinic Rochester, Rochester, MN, United States
- David Routman
- Department of Radiation Oncology, Mayo Clinic Rochester, Rochester, MN, United States
- Samir H. Patel
- Department of Radiation Oncology, Mayo Clinic Arizona, Phoenix, AZ, United States
- Srinivas Seetamsetty
- Department of Radiation Oncology, Mayo Clinic Rochester, Rochester, MN, United States
- Douglas Moseley
- Department of Radiation Oncology, Mayo Clinic Rochester, Rochester, MN, United States
11
Hanna EM, Sargent E, Hua CH, Merchant TE, Ates O. Performance analysis and knowledge-based quality assurance of critical organ auto-segmentation for pediatric craniospinal irradiation. Sci Rep 2024; 14:4251. [PMID: 38378834 PMCID: PMC11310500 DOI: 10.1038/s41598-024-55015-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2023] [Accepted: 02/19/2024] [Indexed: 02/22/2024] Open
Abstract
Craniospinal irradiation (CSI) is a vital therapeutic approach for young patients suffering from central nervous system disorders such as medulloblastoma. The task of accurately outlining the treatment area is particularly time-consuming due to the presence of several sensitive organs at risk (OAR) that can be affected by radiation. This study aimed to assess two different methods for automating the segmentation process: an atlas technique and a deep learning neural network approach. Additionally, a novel method was devised to prospectively evaluate the accuracy of automated segmentation as a knowledge-based quality assurance (QA) tool. In a patient cohort of 100, ranging in age from 2 to 25 years with a median age of 8, this study employed quantitative metrics centered on overlap and distance calculations to determine the most effective approach for practical clinical application. The contours generated by the two methods, atlas and neural network, were compared to ground-truth contours approved by a radiation oncologist, utilizing 13 distinct metrics. Furthermore, an innovative QA tool was designed for forthcoming cases based on the baseline dataset of 100 patient cases. The calculated metrics indicated that, in the majority of cases (60.58%), the neural network method demonstrated notably higher alignment with the ground truth. Instances where no difference was observed accounted for 31.25%, while the atlas method was closer in 8.17%. The QA tool results showed that the two approaches achieved 100% agreement in 39.4% of instances for the atlas method and in 50.6% of instances for the neural network auto-segmentation. The results indicate that the neural network approach offers superior performance, with significantly closer physical alignment to ground-truth contours in the majority of cases.
The metrics derived from overlap and distance measurements have enabled clinicians to discern the optimal choice for practical clinical application.
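The per-case comparison described above, deciding which auto-segmentation method lies closer to the ground truth across several overlap and distance metrics, amounts to a majority vote per case. A hypothetical sketch of that decision logic (the metric names and values are invented, and the study's actual aggregation rule is not specified here):

```python
# Per-case winner between two auto-segmentation methods, decided by
# majority vote over several comparison metrics against ground truth.

def winner(atlas_scores, nn_scores, higher_is_better):
    """Return 'nn', 'atlas', or 'tie' by majority vote over metrics.

    atlas_scores / nn_scores: metric values for each method;
    higher_is_better: per-metric flag (True for overlap metrics like
    Dice, False for distance metrics like HD95)."""
    nn_wins = atlas_wins = 0
    for a, n, hib in zip(atlas_scores, nn_scores, higher_is_better):
        if a == n:
            continue  # no difference on this metric
        if (n > a) == hib:
            nn_wins += 1
        else:
            atlas_wins += 1
    if nn_wins > atlas_wins:
        return "nn"
    if atlas_wins > nn_wins:
        return "atlas"
    return "tie"

# One hypothetical case with three metrics:
# Dice (higher better), HD95 and mean surface distance (lower better).
result = winner([0.80, 6.0, 1.9], [0.86, 4.2, 1.5], [True, False, False])
```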
Affiliation(s)
- Emeline M Hanna
- St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Emma Sargent
- St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Chia-Ho Hua
- St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Ozgur Ates
- St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
12
Duan J, Bernard ME, Rong Y, Castle JR, Feng X, Johnson JD, Chen Q. Contour subregion error detection methodology using deep learning auto-segmentation. Med Phys 2023; 50:6673-6683. [PMID: 37793103 DOI: 10.1002/mp.16768] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2023] [Revised: 07/26/2023] [Accepted: 09/17/2023] [Indexed: 10/06/2023] Open
Abstract
BACKGROUND Inaccurate manual organ delineation is one of the high-risk failure modes in radiation treatment. Numerous automated contour quality assurance (QA) systems have been developed to assess contour acceptability; however, manual inspection of flagged cases is a time-consuming and challenging process, and can lead to users overlooking the exact error location. PURPOSE Our aim is to develop and validate a contour QA system that can effectively detect and visualize subregional contour errors, both qualitatively and quantitatively. METHODS/MATERIALS A novel contour subregion error detection (CSED) system was developed using subregional surface distance discrepancies between manual and deep learning auto-segmentation (DLAS) contours. A validation study was conducted using a head and neck public dataset containing 339 cases and evaluated according to knowledge-based pass criteria derived from a clinical training dataset of 60 cases. A blind qualitative evaluation was conducted, comparing the results from the CSED system with manual labels. Subsequently, the CSED-flagged cases were re-examined by a radiation oncologist. RESULTS The CSED system could visualize the diverse types of subregional contour errors qualitatively and quantitatively. In the validation dataset, the CSED system resulted in true positive rates (TPR) of 0.814, 0.800, and 0.771; false positive rates (FPR) of 0.310, 0.267, and 0.298; and accuracies of 0.735, 0.759, and 0.730, for brainstem and left and right parotid contours, respectively. The CSED-assisted manual review caught 13 brainstem, 19 left parotid, and 21 right parotid contour errors missed by conventional human review. The TPR/FPR/accuracy of the CSED-assisted manual review improved to 0.836/0.253/0.784, 0.831/0.171/0.830, and 0.808/0.193/0.807 for each structure, respectively. 
Further, CSED-assisted review reduced review time by 75%, with review taking 24.81 ± 12.84, 26.75 ± 10.41, and 28.71 ± 13.72 s for each structure, respectively. CONCLUSIONS The CSED system enables qualitative and quantitative detection, localization, and visualization of subregional manual-segmentation errors, utilizing DLAS contours as references. The use of this system has been shown to help reduce the risk of high-risk failure modes resulting from inaccurate organ segmentation.
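The core idea of subregional error detection, comparing manual and reference contours per region of the surface rather than globally, can be illustrated with a deliberately simplified geometry: split the surface points into octants around the centroid and flag any octant whose distance to the reference exceeds a pass criterion. This is an assumption-laden sketch, not the CSED implementation (which derives its pass criteria from a clinical training dataset):

```python
# Flag subregions of a contour whose surface deviates from a
# reference (e.g. DLAS) contour beyond a pass threshold.
import numpy as np

def flag_subregions(points, dists, threshold):
    """points: (N, 3) surface coordinates of the manual contour;
    dists: (N,) distance of each point to the reference surface.
    Returns the set of octant indices (0-7, around the centroid)
    whose mean surface distance exceeds `threshold`."""
    centroid = points.mean(axis=0)
    signs = (points >= centroid).astype(int)       # (N, 3) in {0, 1}
    octant = signs[:, 0] * 4 + signs[:, 1] * 2 + signs[:, 2]
    flagged = set()
    for o in range(8):
        sel = octant == o
        if sel.any() and dists[sel].mean() > threshold:
            flagged.add(o)
    return flagged

# Toy case: random surface points with a small baseline discrepancy,
# inflated in one corner to simulate a localized contouring error.
rng = np.random.default_rng(1)
pts = rng.uniform(-1, 1, size=(400, 3))
d = np.full(400, 0.5)
corner = (pts >= pts.mean(axis=0)).all(axis=1)     # octant 7
d[corner] = 3.0
flags = flag_subregions(pts, d, threshold=2.0)
```

A reviewer would then inspect only the flagged octant(s) instead of the whole contour, which is the source of the time savings reported above.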
Affiliation(s)
- Jingwei Duan
- Department of Radiation Medicine, University of Kentucky, Lexington, Kentucky, USA
- Mark E Bernard
- Department of Radiation Medicine, University of Kentucky, Lexington, Kentucky, USA
- Yi Rong
- Department of Radiation Oncology, Mayo Clinic, Phoenix, Arizona, USA
- James R Castle
- Department of Radiation Medicine, University of Kentucky, Lexington, Kentucky, USA
- Xue Feng
- Carina Medical LLC, Lexington, Kentucky, USA
- Jeremiah D Johnson
- Department of Radiation Medicine, University of Kentucky, Lexington, Kentucky, USA
- Quan Chen
- Department of Radiation Medicine, University of Kentucky, Lexington, Kentucky, USA
- Department of Radiation Oncology, City of Hope Comprehensive Cancer Center, Duarte, California, USA