1
Kensen CM, Simões R, Betgen A, Wiersema L, Lambregts DM, Peters FP, Marijnen CA, van der Heide UA, Janssen TM. Incorporating patient-specific information for the development of rectal tumor auto-segmentation models for online adaptive magnetic resonance Image-guided radiotherapy. Phys Imaging Radiat Oncol 2024; 32:100648. [PMID: 39319094 PMCID: PMC11421252 DOI: 10.1016/j.phro.2024.100648] [Received: 05/21/2024] [Revised: 08/29/2024] [Accepted: 09/11/2024] [Indexed: 09/26/2024] [Open access]
Abstract
Background and purpose In online adaptive magnetic resonance image (MRI)-guided radiotherapy (MRIgRT), manual contouring of rectal tumors on daily images is labor-intensive and time-consuming. Automation of this task is complex due to substantial variation in tumor shape and location between patients. The aim of this work was to investigate different approaches to propagating patient-specific prior information to the online adaptive treatment fractions to improve deep-learning based auto-segmentation of rectal tumors. Materials and methods 243 T2-weighted MRI scans of 49 rectal cancer patients treated on the 1.5T MR-Linear accelerator (MR-Linac) were used to train models to segment rectal tumors. As a benchmark, an MRI_only auto-segmentation model was trained. Three approaches to including a patient-specific prior were studied: 1. including the segmentation of fraction 1 as an extra input channel for the auto-segmentation of subsequent fractions (MRI+prior), 2. fine-tuning the MRI_only model on fraction 1 (PSF_1), and 3. fine-tuning the MRI_only model on all earlier fractions (PSF_cumulative). Auto-segmentations were compared to the manual segmentation using geometric similarity metrics. Clinical impact was assessed by evaluating post-treatment target coverage. Results All patient-specific methods outperformed the MRI_only segmentation approach. Median 95th percentile Hausdorff distances (95HD) were 22.0 (range: 6.1-76.6) mm for MRI_only segmentation, 9.9 (range: 2.5-38.2) mm for MRI+prior segmentation, 6.4 (range: 2.4-17.8) mm for PSF_1 and 4.8 (range: 1.7-26.9) mm for PSF_cumulative. PSF_cumulative was found to be superior to PSF_1 from fraction 4 onward (p = 0.014). Conclusion Patient-specific fine-tuning of automatically segmented rectal tumors, using images and segmentations from all previous fractions, yields superior quality compared to other auto-segmentation approaches.
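As a rough illustration of the first approach in the abstract above (fraction 1 segmentation as an extra input channel), the following Python sketch stacks a daily MRI volume and a prior mask into a two-channel array. The function name `build_network_input`, the z-score normalization, and the assumption that the prior has already been resampled onto the daily image grid are all hypothetical; the abstract does not describe the authors' preprocessing.

```python
import numpy as np


def build_network_input(daily_mri, prior_mask):
    """Stack a daily MRI and a prior segmentation as channels (hypothetical helper).

    Assumes both arrays share the same voxel grid, i.e. the fraction 1
    segmentation has already been registered or resampled to the daily image.
    """
    mri = np.asarray(daily_mri, dtype=np.float32)
    prior = np.asarray(prior_mask, dtype=np.float32)
    if mri.shape != prior.shape:
        raise ValueError("MRI and prior segmentation must share a voxel grid")

    # Z-score intensity normalization: an assumption, not taken from the abstract.
    std = mri.std()
    mri = (mri - mri.mean()) / std if std > 0 else mri - mri.mean()

    # Channel axis first: channel 0 = image, channel 1 = patient-specific prior.
    return np.stack([mri, prior], axis=0)


# Toy volume of shape (slices, rows, columns) with a small prior "tumor".
mri = np.arange(24, dtype=float).reshape(2, 3, 4)
prior = np.zeros((2, 3, 4))
prior[0, 1, 1:3] = 1
x = build_network_input(mri, prior)
print(x.shape)  # (2, 2, 3, 4)
```

A segmentation network trained this way would simply be configured with two input channels instead of one; the fine-tuning approaches (PSF_1, PSF_cumulative) instead keep a single-channel input and update the model weights per patient.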
Affiliation(s)
- Chavelli M. Kensen
- Department of Radiation Oncology, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, the Netherlands
- Rita Simões
- Department of Radiation Oncology, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, the Netherlands
- Anja Betgen
- Department of Radiation Oncology, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, the Netherlands
- Lisa Wiersema
- Department of Radiation Oncology, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, the Netherlands
- Doenja M.J. Lambregts
- Department of Radiology, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, the Netherlands
- Femke P. Peters
- Department of Radiation Oncology, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, the Netherlands
- Corrie A.M. Marijnen
- Department of Radiation Oncology, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, the Netherlands
- Uulke A. van der Heide
- Department of Radiation Oncology, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, the Netherlands
- Tomas M. Janssen
- Department of Radiation Oncology, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, the Netherlands
2
Zhang Y, Amjad A, Ding J, Sarosiek C, Zarenia M, Conlin R, Hall WA, Erickson B, Paulson E. Comprehensive Clinical Usability-Oriented Contour Quality Evaluation for Deep Learning Auto-segmentation: Combining Multiple Quantitative Metrics Through Machine Learning. Pract Radiat Oncol 2024:S1879-8500(24)00204-2. [PMID: 39233005 DOI: 10.1016/j.prro.2024.07.007] [Received: 04/22/2024] [Revised: 06/07/2024] [Accepted: 07/30/2024] [Indexed: 09/06/2024]
Abstract
PURPOSE The current commonly used metrics for evaluating the quality of auto-segmented contours have limitations and do not always reflect the clinical usefulness of the contours. This work aims to develop a novel contour quality classification (CQC) method by combining multiple quantitative metrics for clinical usability-oriented contour quality evaluation for deep learning-based auto-segmentation (DLAS). METHODS AND MATERIALS The CQC was designed to categorize contours on slices as acceptable, minor edit, or major edit based on the expected editing effort/time with supervised ensemble tree classification models using 7 quantitative metrics. Organ-specific models were trained for 5 abdominal organs (pancreas, duodenum, stomach, small, and large bowels) using 50 magnetic resonance imaging (MRI) data sets. Twenty additional MRI and 9 computed tomography (CT) data sets were employed for testing. Interobserver variation (IOV) was assessed among 6 observers and consensus labels were established through majority vote for evaluation. The CQC was also compared with a threshold-based baseline approach. RESULTS For the 5 organs, the average area under the curve was 0.982 ± 0.01 and 0.979 ± 0.01, the mean accuracy was 95.8% ± 1.7% and 94.3% ± 2.1%, and the mean risk rate was 0.8% ± 0.4% and 0.7% ± 0.5% for MRI and CT testing data set, respectively. The CQC results closely matched the IOV results (mean accuracy of 94.2% ± 0.8% and 94.8% ± 1.7%) and were significantly higher than those obtained using the threshold-based method (mean accuracy of 80.0% ± 4.7%, 83.8% ± 5.2%, and 77.3% ± 6.6% using 1, 2, and 3 metrics). CONCLUSIONS The CQC models demonstrated high performance in classifying the quality of contour slices. This method can address the limitations of existing metrics and offers an intuitive and comprehensive solution for clinically oriented evaluation and comparison of DLAS systems.
Affiliation(s)
- Ying Zhang
- Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, Wisconsin; Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, Texas.
- Asma Amjad
- Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, Wisconsin
- Jie Ding
- Department of Radiation Oncology, Emory University School of Medicine, Atlanta, Georgia
- Christina Sarosiek
- Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, Wisconsin
- Mohammad Zarenia
- Department of Radiation Medicine, MedStar Georgetown University Hospital, Washington, District of Columbia
- Renae Conlin
- Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, Wisconsin
- William A Hall
- Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, Wisconsin
- Beth Erickson
- Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, Wisconsin
- Eric Paulson
- Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, Wisconsin
3
Hurkmans C, Bibault JE, Brock KK, van Elmpt W, Feng M, David Fuller C, Jereczek-Fossa BA, Korreman S, Landry G, Madesta F, Mayo C, McWilliam A, Moura F, Muren LP, El Naqa I, Seuntjens J, Valentini V, Velec M. A joint ESTRO and AAPM guideline for development, clinical validation and reporting of artificial intelligence models in radiation therapy. Radiother Oncol 2024; 197:110345. [PMID: 38838989 DOI: 10.1016/j.radonc.2024.110345] [Received: 05/23/2024] [Accepted: 05/23/2024] [Indexed: 06/07/2024]
Abstract
BACKGROUND AND PURPOSE Artificial Intelligence (AI) models in radiation therapy are being developed with increasing pace. Despite this, the radiation therapy community has not widely adopted these models in clinical practice. A cohesive guideline on how to develop, report and clinically validate AI algorithms might help bridge this gap. METHODS AND MATERIALS A Delphi process with all co-authors was followed to determine which topics should be addressed in this comprehensive guideline. Separate sections of the guideline, including Statements, were written by subgroups of the authors and discussed with the whole group at several meetings. Statements were formulated and scored as highly recommended or recommended. RESULTS The following topics were found most relevant: Decision making, image analysis, volume segmentation, treatment planning, patient specific quality assurance of treatment delivery, adaptive treatment, outcome prediction, training, validation and testing of AI model parameters, model availability for others to verify, model quality assurance/updates and upgrades, ethics. Key references were given together with an outlook on current hurdles and possibilities to overcome these. 19 Statements were formulated. CONCLUSION A cohesive guideline has been written which addresses main topics regarding AI in radiation therapy. It will help to guide development, as well as transparent and consistent reporting and validation of new AI tools and facilitate adoption.
Affiliation(s)
- Coen Hurkmans
- Department of Radiation Oncology, Catharina Hospital, Eindhoven, the Netherlands; Department of Electrical Engineering, Technical University Eindhoven, Eindhoven, the Netherlands.
- Kristy K Brock
- Departments of Imaging Physics and Radiation Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Wouter van Elmpt
- Department of Radiation Oncology (MAASTRO), GROW - School for Oncology and Reproduction, Maastricht University Medical Centre+, Maastricht, the Netherlands
- Mary Feng
- University of California San Francisco, San Francisco, CA, USA
- Clifton David Fuller
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer, Houston, TX
- Barbara A Jereczek-Fossa
- Dept. of Oncology and Hemato-oncology, University of Milan, Milan, Italy; Dept. of Radiation Oncology, IEO European Institute of Oncology IRCCS, Milan, Italy
- Stine Korreman
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark; Danish Center for Particle Therapy, Aarhus University Hospital, Aarhus, Denmark
- Guillaume Landry
- Department of Radiation Oncology, LMU University Hospital, LMU Munich, Munich, Germany; German Cancer Consortium (DKTK), Partner Site Munich, a Partnership between DKFZ and LMU University Hospital Munich, Germany; Bavarian Cancer Research Center (BZKF), Partner Site Munich, Munich, Germany
- Frederic Madesta
- Department of Computational Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg, Germany; Institute for Applied Medical Informatics, University Medical Center Hamburg-Eppendorf, Hamburg, Germany; Center for Biomedical Artificial Intelligence (bAIome), University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Chuck Mayo
- Institute for Healthcare Policy and Innovation, University of Michigan, USA
- Alan McWilliam
- Division of Cancer Sciences, The University of Manchester, Manchester, UK
- Filipe Moura
- CrossI&D Lisbon Research Center, Portuguese Red Cross Higher Health School Lisbon, Portugal
- Ludvig P Muren
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark; Danish Center for Particle Therapy, Aarhus University Hospital, Aarhus, Denmark
- Issam El Naqa
- Department of Machine Learning, Moffitt Cancer Center, Tampa, FL 33612, USA
- Jan Seuntjens
- Princess Margaret Cancer Centre, Radiation Medicine Program, University Health Network & Departments of Radiation Oncology and Medical Biophysics, University of Toronto, Toronto, Canada
- Vincenzo Valentini
- Department of Diagnostic Imaging, Oncological Radiotherapy and Hematology, Fondazione Policlinico Universitario "Agostino Gemelli" IRCCS, Rome, Italy; Università Cattolica del Sacro Cuore, Rome, Italy
- Michael Velec
- Radiation Medicine Program, Princess Margaret Cancer Centre and Department of Radiation Oncology, University of Toronto, Toronto, Canada
4
Johnson CL, Press RH, Simone CB, Shen B, Tsai P, Hu L, Yu F, Apinorasethkul C, Ackerman C, Zhai H, Lin H, Huang S. Clinical validation of commercial deep-learning based auto-segmentation models for organs at risk in the head and neck region: a single institution study. Front Oncol 2024; 14:1375096. [PMID: 39055552 PMCID: PMC11269179 DOI: 10.3389/fonc.2024.1375096] [Received: 01/23/2024] [Accepted: 06/20/2024] [Indexed: 07/27/2024] [Open access]
Abstract
Purpose To evaluate organ at risk (OAR) auto-segmentation in the head and neck region of computed tomography images using two different commercially available deep-learning-based auto-segmentation (DLAS) tools in a single-institution clinical application. Methods Twenty-two OARs were manually contoured by clinicians according to published guidelines on planning computed tomography (pCT) images for 40 clinical head and neck cancer (HNC) cases. Automatic contours were generated for each patient using two deep-learning-based auto-segmentation models: Manteia AccuContour and MIM ProtégéAI. The accuracy and integrity of autocontours (ACs) were then compared to expert contours (ECs) using the Sørensen-Dice similarity coefficient (DSC) and Mean Distance (MD) metrics. Results ACs were generated for 22 OARs using AccuContour and 17 OARs using ProtégéAI, with average contour generation times of 1 min/patient and 5 min/patient, respectively. EC and AC agreement was highest for the mandible (DSC 0.90 ± 0.16 and 0.91 ± 0.03) and lowest for the chiasm (DSC 0.28 ± 0.14 and 0.30 ± 0.14) for AccuContour and ProtégéAI, respectively. Using AccuContour, the average MD was <1 mm for 10 of the 22 OARs contoured, 1-2 mm for 6 OARs, and 2-3 mm for 6 OARs. For ProtégéAI, the average MD was <1 mm for 8 of 17 OARs, 1-2 mm for 6 OARs, and 2-3 mm for 3 OARs. Conclusions Both DLAS programs proved to be valuable tools for significantly reducing the time required to generate large numbers of OAR contours in the head and neck region, even though manual editing of ACs is likely needed prior to implementation into treatment planning. The DSCs and MDs achieved were similar to those reported in other studies that evaluated various other DLAS solutions. Still, small-volume structures with nonideal contrast in CT images, such as nerves, are very challenging and will require additional solutions to achieve sufficient results.
Affiliation(s)
- Brian Shen
- New York Proton Center, New York, NY, United States
- Lei Hu
- New York Proton Center, New York, NY, United States
- Francis Yu
- New York Proton Center, New York, NY, United States
- Huifang Zhai
- New York Proton Center, New York, NY, United States
- Haibo Lin
- New York Proton Center, New York, NY, United States
- Sheng Huang
- New York Proton Center, New York, NY, United States
- National Clinical Research Center for Cancer, Tianjin’s Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute & Hospital, Tianjin, China
5
Kawamoto S, Zhu Z, Chu LC, Javed AA, Kinny-Köster B, Wolfgang CL, Hruban RH, Kinzler KW, Fouladi DF, Blanco A, Shayesteh S, Fishman EK. Deep neural network-based segmentation of normal and abnormal pancreas on abdominal CT: evaluation of global and local accuracies. Abdom Radiol (NY) 2024; 49:501-511. [PMID: 38102442 DOI: 10.1007/s00261-023-04122-6] [Received: 05/02/2023] [Revised: 10/30/2023] [Accepted: 11/03/2023] [Indexed: 12/17/2023]
Abstract
PURPOSE Delay in diagnosis can contribute to poor outcomes in pancreatic ductal adenocarcinoma (PDAC), and new tools for early detection are required. Recent application of artificial intelligence to cancer imaging has demonstrated great potential in detecting subtle early lesions. The aim of the study was to evaluate global and local accuracies of deep neural network (DNN) segmentation of normal and abnormal pancreas with pancreatic mass. METHODS Our previously developed and reported residual deep supervision network for segmentation of PDAC was applied to segment pancreas using CT images of potential renal donors (normal pancreas) and patients with suspected PDAC (abnormal pancreas). Accuracy of DNN pancreas segmentation was assessed using the Dice similarity coefficient (DSC), average symmetric surface distance (ASSD), and 95th-percentile Hausdorff distance (HD95) as compared to manual segmentation. Furthermore, two radiologists semi-quantitatively assessed local accuracies and estimated the volume of correctly segmented pancreas. RESULTS Forty-two normal and 49 abnormal CTs were assessed. Average DSC was 87.4 ± 3.1% and 85.5 ± 3.2%, ASSD 0.97 ± 0.30 and 1.34 ± 0.65, HD95 4.28 ± 2.36 and 6.31 ± 6.31 for normal and abnormal pancreas, respectively. Semi-quantitatively, ≥95% of pancreas volume was correctly segmented in 95.2% and 53.1% of normal and abnormal pancreas by both radiologists, and 97.6% and 75.5% by at least one radiologist. The most common segmentation errors were made on pancreatic and duodenal borders in both groups, and were related to pancreatic tumor, including duct dilatation, atrophy, tumor infiltration and collateral vessels. CONCLUSION Pancreas DNN segmentation is accurate in the majority of cases; however, minor manual editing may be necessary, particularly in abnormal pancreas.
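For readers unfamiliar with the geometric metrics named in the abstract above, the NumPy sketch below computes a Dice similarity coefficient and a 95th-percentile Hausdorff distance for two binary masks. It is an illustrative sketch only, not the study's evaluation code: the function names are hypothetical, surface voxels are taken as foreground voxels with at least one background face-neighbor, and HD95 is taken as the larger of the two directed 95th-percentile surface distances (other definitions pool both directions before taking the percentile).

```python
import numpy as np


def dice(a, b):
    """Dice similarity coefficient of two binary masks (1.0 if both are empty)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / total


def surface_points(mask, spacing):
    """Physical coordinates of foreground voxels that touch the background."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, constant_values=False)
    crop = (slice(1, -1),) * mask.ndim
    interior = np.ones_like(mask)
    for axis in range(mask.ndim):
        for shift in (-1, 1):
            # A voxel stays "interior" only if this face-neighbor is foreground.
            interior &= np.roll(padded, shift, axis=axis)[crop]
    surface = mask & ~interior
    return np.argwhere(surface) * np.asarray(spacing, dtype=float)


def hd95(a, b, spacing):
    """Max of the two directed 95th-percentile surface distances (assumed definition)."""
    pa = surface_points(a, spacing)
    pb = surface_points(b, spacing)
    if len(pa) == 0 or len(pb) == 0:
        raise ValueError("both masks must be non-empty")
    # Brute-force pairwise distances; adequate for small masks only.
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    return max(np.percentile(d.min(axis=1), 95), np.percentile(d.min(axis=0), 95))


# Toy example: two 4x4 squares offset by one pixel, 1 mm isotropic spacing.
manual = np.zeros((10, 10), dtype=bool)
manual[2:6, 2:6] = True
auto = np.zeros((10, 10), dtype=bool)
auto[2:6, 3:7] = True
print(dice(manual, auto))          # 0.75
print(hd95(manual, auto, (1, 1)))  # 1.0
```

For clinical-size volumes the pairwise distance matrix becomes too large; a distance transform or k-d tree would normally replace the brute-force step.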
Affiliation(s)
- Satomi Kawamoto
- The Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, 601 N. Caroline Street, Baltimore, MD, 21287, USA.
- Zhuotun Zhu
- The Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, 601 N. Caroline Street, Baltimore, MD, 21287, USA
- Linda C Chu
- The Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, 601 N. Caroline Street, Baltimore, MD, 21287, USA
- Ammar A Javed
- Department of Surgery, School of Medicine, Johns Hopkins University, Blalock Building, 600 N. Wolfe Street, Baltimore, MD, 21287, USA
- Benedict Kinny-Köster
- Department of Surgery, School of Medicine, Johns Hopkins University, Blalock Building, 600 N. Wolfe Street, Baltimore, MD, 21287, USA
- Department of General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Heidelberg, Germany
- Christopher L Wolfgang
- Department of Surgery, School of Medicine, Johns Hopkins University, Blalock Building, 600 N. Wolfe Street, Baltimore, MD, 21287, USA
- Ralph H Hruban
- Department of Pathology, The Sol Goldman Pancreatic Cancer Research Center, Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA
- The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, 21231, USA
- Kenneth W Kinzler
- The Ludwig Center, The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, 21231, USA
- Daniel Fadaei Fouladi
- The Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, 601 N. Caroline Street, Baltimore, MD, 21287, USA
- Alejandra Blanco
- The Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, 601 N. Caroline Street, Baltimore, MD, 21287, USA
- Shahab Shayesteh
- The Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, 601 N. Caroline Street, Baltimore, MD, 21287, USA
- Elliot K Fishman
- The Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, 601 N. Caroline Street, Baltimore, MD, 21287, USA
6
Chen L, Platzer P, Reschl C, Schafasand M, Nachankar A, Lukas Hajdusich C, Kuess P, Stock M, Habraken S, Carlino A. Validation of a deep-learning segmentation model for adult and pediatric head and neck radiotherapy in different patient positions. Phys Imaging Radiat Oncol 2024; 29:100527. [PMID: 38222671 PMCID: PMC10787237 DOI: 10.1016/j.phro.2023.100527] [Received: 05/10/2023] [Revised: 12/15/2023] [Accepted: 12/18/2023] [Indexed: 01/16/2024] [Open access]
Abstract
Background and purpose Autocontouring for radiotherapy has the potential to save significant time and reduce interobserver variability. We aimed to assess the performance of a commercial autocontouring model for head and neck (H&N) patients in eight orientations relevant to particle therapy with fixed beam lines, focusing on validation and implementation for routine clinical use. Materials and methods Autocontouring was performed on sixteen organs at risk (OARs) for 98 adult and pediatric patients with 137 H&N CT scans in eight orientations. A geometric comparison of the autocontours and manual segmentations was performed using the Hausdorff Distance 95th percentile (HD95), Dice Similarity Coefficient (DSC) and surface DSC, and compared to interobserver variability where available. Additional qualitative scoring and dose-volume-histogram (DVH) parameter analyses were performed for twenty patients in two positions, consisting of scoring on a 0-3 scale based on clinical usability and comparing the mean (Dmean) and near-maximum (D2%) dose, respectively. Results For the geometric analysis, the model performance in head-first-supine straight and hyperextended orientations was in the same range as the interobserver variability. HD95, DSC and surface DSC were heterogeneous in other orientations. No significant geometric differences were found between pediatric and adult autocontours. The qualitative scoring yielded a median score of ≥ 2 for 13/16 OARs, while 7/32 DVH parameters were significantly different. Conclusions For head-first-supine straight and hyperextended scans, we found that 13/16 OAR autocontours were suited for use in daily clinical practice, and these were subsequently implemented. Further development is needed for other patient orientations before implementation.
Affiliation(s)
- Linda Chen
- MedAustron Ion Therapy Center, Department of Medical Physics, Wiener Neustadt, Austria
- Erasmus MC Cancer Institute, University Medical Center, Department of Radiotherapy, Rotterdam, the Netherlands
- Delft University of Technology, Faculty of Mechanical, Maritime and Materials Engineering, Delft, the Netherlands
- Leiden University Medical Center, Faculty of Medicine, Leiden, the Netherlands
- Patricia Platzer
- MedAustron Ion Therapy Center, Department of Medical Physics, Wiener Neustadt, Austria
- Fachhochschule Wiener Neustadt, Department MedTech, Wiener Neustadt, Austria
- Christian Reschl
- MedAustron Ion Therapy Center, Department of Medical Physics, Wiener Neustadt, Austria
- Mansure Schafasand
- MedAustron Ion Therapy Center, Department of Medical Physics, Wiener Neustadt, Austria
- Medical University of Vienna, Department of Radiation Oncology, Vienna, Austria
- Karl Landsteiner University of Health Sciences, Department of Oncology, Krems an der Donau, Austria
- Ankita Nachankar
- MedAustron Ion Therapy Center, Department of Medical Physics, Wiener Neustadt, Austria
- ACMIT Gmbh, Department of Medicine, Wiener Neustadt, Austria
- Peter Kuess
- Medical University of Vienna, Department of Radiation Oncology, Vienna, Austria
- Markus Stock
- MedAustron Ion Therapy Center, Department of Medical Physics, Wiener Neustadt, Austria
- Karl Landsteiner University of Health Sciences, Department of Oncology, Krems an der Donau, Austria
- Steven Habraken
- Erasmus MC Cancer Institute, University Medical Center, Department of Radiotherapy, Rotterdam, the Netherlands
- Holland Proton Therapy Center, Department of Medical Physics & Informatics, Delft, the Netherlands
- Antonio Carlino
- MedAustron Ion Therapy Center, Department of Medical Physics, Wiener Neustadt, Austria
7
De Kerf G, Claessens M, Raouassi F, Mercier C, Stas D, Ost P, Dirix P, Verellen D. A geometry and dose-volume based performance monitoring of artificial intelligence models in radiotherapy treatment planning for prostate cancer. Phys Imaging Radiat Oncol 2023; 28:100494. [PMID: 37809056 PMCID: PMC10550805 DOI: 10.1016/j.phro.2023.100494] [Received: 06/06/2023] [Revised: 09/20/2023] [Accepted: 09/20/2023] [Indexed: 10/10/2023] [Open access]
Abstract
Background and Purpose Clinical Artificial Intelligence (AI) implementations lack ground truth when applied to real-world data. This study investigated how combined geometrical and dose-volume metrics can be used as performance monitoring tools to detect clinically relevant candidates for model retraining. Materials and Methods Fifty patients were analyzed for both AI-segmentation and planning. For AI-segmentation, geometrical (Standard Surface Dice 3 mm and Local Surface Dice 3 mm) and dose-volume based parameters were calculated for two organs (bladder and anorectum) to compare AI output against the clinically corrected structure. A Local Surface Dice was introduced to detect geometrical changes in the vicinity of the target volumes, while an Absolute Dose Difference (ADD) evaluation increased focus on dose-volume related changes. AI-planning performance was evaluated using clinical goal analysis in combination with volume and target overlap metrics. Results The Local Surface Dice reported equal or lower values compared to the Standard Surface Dice (anorectum: 0.93 ± 0.11 vs 0.98 ± 0.04; bladder: 0.97 ± 0.06 vs 0.98 ± 0.04). The ADD metric showed a difference of (0.9 ± 0.8) Gy for the anorectum D1cm³ and (0.7 ± 1.5) Gy for the bladder D5cm³. Mandatory clinical goals were fulfilled in 90% of the DLP plans. Conclusions Combining dose-volume and geometrical metrics allowed detection of clinically relevant changes in both auto-segmentation and auto-planning output, and the Local Surface Dice was more sensitive to local changes than the Standard Surface Dice. This monitoring is able to evaluate AI behavior in clinical practice and allows candidate selection for active learning.
Affiliation(s)
- Geert De Kerf
- Department of Radiation Oncology, Iridium Netwerk, Wilrijk (Antwerp), Belgium
- Michaël Claessens
- Department of Radiation Oncology, Iridium Netwerk, Wilrijk (Antwerp), Belgium
- Centre for Oncological Research (CORE), Integrated Personalized and Precision Oncology Network (IPPON), University of Antwerp, Antwerp, Belgium
- Fadoua Raouassi
- Department of Radiation Oncology, Iridium Netwerk, Wilrijk (Antwerp), Belgium
- Carole Mercier
- Department of Radiation Oncology, Iridium Netwerk, Wilrijk (Antwerp), Belgium
- Centre for Oncological Research (CORE), Integrated Personalized and Precision Oncology Network (IPPON), University of Antwerp, Antwerp, Belgium
- Daan Stas
- Department of Radiation Oncology, Iridium Netwerk, Wilrijk (Antwerp), Belgium
- Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium
- Piet Ost
- Department of Radiation Oncology, Iridium Netwerk, Wilrijk (Antwerp), Belgium
- Centre for Oncological Research (CORE), Integrated Personalized and Precision Oncology Network (IPPON), University of Antwerp, Antwerp, Belgium
- Piet Dirix
- Department of Radiation Oncology, Iridium Netwerk, Wilrijk (Antwerp), Belgium
- Centre for Oncological Research (CORE), Integrated Personalized and Precision Oncology Network (IPPON), University of Antwerp, Antwerp, Belgium
- Dirk Verellen
- Department of Radiation Oncology, Iridium Netwerk, Wilrijk (Antwerp), Belgium
- Centre for Oncological Research (CORE), Integrated Personalized and Precision Oncology Network (IPPON), University of Antwerp, Antwerp, Belgium
8
Doolan PJ, Charalambous S, Roussakis Y, Leczynski A, Peratikou M, Benjamin M, Ferentinos K, Strouthos I, Zamboglou C, Karagiannis E. A clinical evaluation of the performance of five commercial artificial intelligence contouring systems for radiotherapy. Front Oncol 2023; 13:1213068. [PMID: 37601695 PMCID: PMC10436522 DOI: 10.3389/fonc.2023.1213068] [Received: 04/27/2023] [Accepted: 07/17/2023] [Indexed: 08/22/2023] [Open access]
Abstract
Purpose/objectives Auto-segmentation with artificial intelligence (AI) offers an opportunity to reduce inter- and intra-observer variability in contouring, to improve the quality of contours, and to reduce the time taken to conduct this manual task. In this work we benchmark the AI auto-segmentation contours produced by five commercial vendors against a common dataset. Methods and materials The organ at risk (OAR) contours generated by five commercial AI auto-segmentation solutions (Mirada (Mir), MVision (MV), Radformation (Rad), RayStation (Ray) and TheraPanacea (Ther)) were compared to manually drawn expert contours from 20 breast, 20 head and neck, 20 lung and 20 prostate patients. Comparisons were made using geometric similarity metrics including volumetric and surface Dice similarity coefficient (vDSC and sDSC), Hausdorff distance (HD) and Added Path Length (APL). To assess the time saved, the time taken to manually draw the expert contours and the time taken to correct the AI contours were recorded. Results The number of CT contours offered by each AI auto-segmentation solution at the time of the study differed (Mir 99; MV 143; Rad 83; Ray 67; Ther 86), with all offering contours of some lymph node levels as well as OARs. Averaged across all structures, the median vDSCs were good for all systems and compared favorably with existing literature: Mir 0.82; MV 0.88; Rad 0.86; Ray 0.87; Ther 0.88. All systems offer substantial time savings, ranging between: breast 14-20 mins; head and neck 74-93 mins; lung 20-26 mins; prostate 35-42 mins. The time saved, averaged across all structures, was similar for all systems: Mir 39.8 mins; MV 43.6 mins; Rad 36.6 mins; Ray 43.2 mins; Ther 45.2 mins.
Conclusions All five commercial AI auto-segmentation solutions evaluated in this work offer high quality contours in significantly reduced time compared to manual contouring, and could be used to render the radiotherapy workflow more efficient and standardized.
Affiliation(s)
- Paul J. Doolan
- Department of Medical Physics, German Oncology Center, Limassol, Cyprus
- Yiannis Roussakis
- Department of Medical Physics, German Oncology Center, Limassol, Cyprus
- Agnes Leczynski
- Department of Radiation Oncology, German Oncology Center, Limassol, Cyprus
- Mary Peratikou
- Department of Radiation Oncology, German Oncology Center, Limassol, Cyprus
- Melka Benjamin
- Department of Radiation Oncology, German Oncology Center, Limassol, Cyprus
- Konstantinos Ferentinos
- Department of Radiation Oncology, German Oncology Center, Limassol, Cyprus
- School of Medicine, European University Cyprus, Nicosia, Cyprus
- Iosif Strouthos
- Department of Radiation Oncology, German Oncology Center, Limassol, Cyprus
- School of Medicine, European University Cyprus, Nicosia, Cyprus
- Constantinos Zamboglou
- Department of Radiation Oncology, German Oncology Center, Limassol, Cyprus
- School of Medicine, European University Cyprus, Nicosia, Cyprus
- Department of Radiation Oncology, Medical Center – University of Freiburg, Freiburg, Germany
- Efstratios Karagiannis
- Department of Radiation Oncology, German Oncology Center, Limassol, Cyprus
- School of Medicine, European University Cyprus, Nicosia, Cyprus
9
|
Amjad A, Xu J, Thill D, Zhang Y, Ding J, Paulson E, Hall W, Erickson BA, Li XA. Deep learning auto-segmentation on multi-sequence magnetic resonance images for upper abdominal organs. Front Oncol 2023; 13:1209558. [PMID: 37483486 PMCID: PMC10358771 DOI: 10.3389/fonc.2023.1209558] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/20/2023] [Accepted: 06/19/2023] [Indexed: 07/25/2023] Open
Abstract
Introduction Multi-sequence multi-parameter MRIs are often used to define targets and/or organs at risk (OAR) in radiation therapy (RT) planning. Deep learning has so far focused on developing auto-segmentation models based on a single MRI sequence. The purpose of this work is to develop a multi-sequence deep learning based auto-segmentation (mS-DLAS) model based on multi-sequence abdominal MRIs. Materials and methods Using a previously developed 3DResUnet network, an mS-DLAS model using 4 T1 and T2 weighted MRI acquired during routine RT simulation for 71 cases with abdominal tumors was trained and tested. Strategies including data pre-processing, a Z-normalization approach, and data augmentation were employed. Two additional sequence-specific models, T1 weighted (T1-M) and T2 weighted (T2-M), were trained to evaluate the performance of sequence-specific DLAS. Performance of all models was quantitatively evaluated using 6 surface and volumetric accuracy metrics. Results The developed DLAS models were able to generate reasonable contours of 12 upper abdomen organs within 21 seconds for each testing case. The 3D average values of Dice similarity coefficient (DSC), mean distance to agreement (MDA mm), 95th percentile Hausdorff distance (HD95% mm), percent volume difference (PVD), surface DSC (sDSC), and relative added path length (rAPL mm/cc) over all organs were 0.87, 1.79, 7.43, -8.95, 0.82, and 12.25, respectively, for the mS-DLAS model. Collectively, 71% of the auto-segmented contours by the three models had relatively high quality. Additionally, the obtained mS-DLAS successfully segmented 9 out of 16 MRI sequences that were not used in the model training. Conclusion We have developed an MRI-based mS-DLAS model for auto-segmenting of upper abdominal organs on MRI. Multi-sequence segmentation is desirable in routine clinical practice of RT for accurate organ and target delineation, particularly for abdominal tumors.
Our work will act as a stepping stone for acquiring fast and accurate segmentation on multi-contrast MRI and make way for MR only guided radiation therapy.
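The Z-normalization step named in the methods can be sketched as follows. The abstract does not say whether normalization was applied per volume, per sequence, or within a body mask, so this example assumes the simplest case: all voxel intensities of one volume are rescaled to zero mean and unit standard deviation. The function name and the toy intensities are hypothetical.

```python
from statistics import fmean, pstdev
from typing import List, Sequence


def z_normalize(intensities: Sequence[float]) -> List[float]:
    """Rescale intensities to zero mean and unit (population) standard deviation."""
    mean = fmean(intensities)
    std = pstdev(intensities)
    if std == 0.0:
        # Constant image: return zeros rather than dividing by zero.
        return [0.0 for _ in intensities]
    return [(value - mean) / std for value in intensities]


# Toy intensities from two hypothetical sequences with very different scales.
t1_like = [100.0, 200.0, 300.0]
t2_like = [1000.0, 2000.0, 3000.0]
print(z_normalize(t1_like))
print(z_normalize(t2_like))
```

Both toy inputs map to the same normalized values, which is the property that lets a single network consume sequences with different raw intensity ranges.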
Affiliation(s)
- Asma Amjad
  - Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, WI, United States
- Dan Thill
  - Elekta Inc., St. Charles, MO, United States
- Ying Zhang
  - Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, WI, United States
- Jie Ding
  - Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, WI, United States
- Eric Paulson
  - Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, WI, United States
- William Hall
  - Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, WI, United States
- Beth A. Erickson
  - Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, WI, United States
- X. Allen Li
  - Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, WI, United States
10
Bakx N, van der Sangen M, Theuws J, Bluemink H, Hurkmans C. Comparison of the output of a deep learning segmentation model for locoregional breast cancer radiotherapy trained on 2 different datasets. Tech Innov Patient Support Radiat Oncol 2023; 26:100209. [PMID: 37213441 PMCID: PMC10199413 DOI: 10.1016/j.tipsro.2023.100209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Revised: 04/06/2023] [Accepted: 05/09/2023] [Indexed: 05/23/2023] Open
Abstract
Introduction The development of deep learning (DL) models for auto-segmentation is increasing, and more models are becoming commercially available. Mostly, commercial models are trained on external data. To study the effect of using a model trained on external data, compared to the same model trained on in-house collected data, the performance of these two DL models was evaluated. Methods The evaluation was performed using in-house collected data of 30 breast cancer patients. Quantitative analysis was performed using Dice similarity coefficient (DSC), surface DSC (sDSC) and 95th percentile of Hausdorff Distance (95% HD). These values were compared with previously reported inter-observer variations (IOV). Results For a number of structures, statistically significant differences were found between the two models. For organs at risk, mean values for DSC ranged from 0.63 to 0.98 and 0.71 to 0.96 for the in-house and external model, respectively. For target volumes, mean DSC values of 0.57 to 0.94 and 0.33 to 0.92 were found. The difference in 95% HD values between the two models ranged from 0.08 to 3.23 mm, except for CTVn4 with 9.95 mm. For the external model, both DSC and 95% HD are outside the range of IOV for CTVn4, whereas this is the case for the DSC found for the thyroid of the in-house model. Conclusions Statistically significant differences were found between both models, which were mostly within published inter-observer variations, showing clinical usefulness of both models. Our findings could encourage discussion and revision of existing guidelines, to further decrease inter-observer, but also inter-institute variability.
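The 95th percentile Hausdorff distance (95% HD) can be sketched as below. Conventions differ between tools; this example assumes contours given as lists of surface points in millimetres, a nearest-rank percentile, and the maximum of the two directed 95th percentiles. The function names and toy contours are hypothetical and do not reproduce the study's software.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def _directed_distances(src: Sequence[Point], dst: Sequence[Point]) -> List[float]:
    """Distance from every point in src to its nearest neighbour in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def _percentile(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile (one of several conventions in use)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def hd95(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric 95th percentile Hausdorff distance between two point sets."""
    return max(
        _percentile(_directed_distances(a, b), 95.0),
        _percentile(_directed_distances(b, a), 95.0),
    )


# Two parallel toy contours 1 mm apart, plus a copy with a single outlier point.
contour_a = [(float(i), 0.0, 0.0) for i in range(20)]
contour_b = [(float(i), 1.0, 0.0) for i in range(20)]
contour_b_outlier = contour_b + [(0.0, 10.0, 0.0)]
print(hd95(contour_a, contour_b))          # 1.0
print(hd95(contour_a, contour_b_outlier))  # 1.0, the outlier is ignored
```

The classical (maximum) Hausdorff distance for the second pair would be 10 mm, which shows why the 95th percentile variant is preferred when a few stray voxels should not dominate the score.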
Affiliation(s)
- Nienke Bakx
  - Catharina Hospital, Department of Radiation Oncology, 5602ZA Eindhoven, the Netherlands
- Jacqueline Theuws
  - Catharina Hospital, Department of Radiation Oncology, 5602ZA Eindhoven, the Netherlands
- Hanneke Bluemink
  - Catharina Hospital, Department of Radiation Oncology, 5602ZA Eindhoven, the Netherlands
- Coen Hurkmans
  - Catharina Hospital, Department of Radiation Oncology, 5602ZA Eindhoven, the Netherlands
  - Technical University Eindhoven, Faculties of Physics and Electrical Engineering, 5600MB Eindhoven, the Netherlands
11
Lefebvre AL, Yamamoto CAP, Shade JK, Bradley RP, Yu RA, Ali RL, Popescu DM, Prakosa A, Kholmovski EG, Trayanova NA. LASSNet: A Four Steps Deep Neural Network for Left Atrial Segmentation and Scar Quantification. Left Atrial and Scar Quantification and Segmentation: First Challenge, LAScarQS 2022, Held in Conjunction with MICCAI 2022, Singapore, September 18, 2022, Proceedings 2023; 13586:1-15. [PMID: 37287952 PMCID: PMC10246435 DOI: 10.1007/978-3-031-31778-1_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Accurate quantification of left atrium (LA) scar in patients with atrial fibrillation is essential to guide successful ablation strategies. Prior to LA scar quantification, a proper LA cavity segmentation is required to ensure exact location of scar. Both tasks can be extremely time-consuming and are subject to inter-observer disagreements when done manually. We developed and validated a deep neural network to automatically segment the LA cavity and the LA scar. The global architecture uses a multi-network sequential approach in two stages, which segment the LA cavity and the LA scar. Each stage has two steps: a region-of-interest neural network and a refined segmentation network. We analysed the performance of our network according to different parameters and applied data triaging. 200+ late gadolinium enhancement magnetic resonance images were provided by the LAScarQS 2022 Challenge. Finally, we compared our performance for scar quantification to the literature and demonstrated improved performance.
Affiliation(s)
- Arthur L Lefebvre
  - Faculté polytechnique de Mons, UMONS, Mons, Belgium
  - Alliance for Cardiovascular Diagnostic and Treatment Innovation (ADVANCE), Johns Hopkins University, Baltimore, MD, USA
- Carolyna A P Yamamoto
  - Alliance for Cardiovascular Diagnostic and Treatment Innovation (ADVANCE), Johns Hopkins University, Baltimore, MD, USA
  - Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Julie K Shade
  - Alliance for Cardiovascular Diagnostic and Treatment Innovation (ADVANCE), Johns Hopkins University, Baltimore, MD, USA
- Ryan P Bradley
  - Alliance for Cardiovascular Diagnostic and Treatment Innovation (ADVANCE), Johns Hopkins University, Baltimore, MD, USA
- Rebecca A Yu
  - Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Rheeda L Ali
  - Alliance for Cardiovascular Diagnostic and Treatment Innovation (ADVANCE), Johns Hopkins University, Baltimore, MD, USA
  - Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Dan M Popescu
  - Alliance for Cardiovascular Diagnostic and Treatment Innovation (ADVANCE), Johns Hopkins University, Baltimore, MD, USA
- Adityo Prakosa
  - Alliance for Cardiovascular Diagnostic and Treatment Innovation (ADVANCE), Johns Hopkins University, Baltimore, MD, USA
- Eugene G Kholmovski
  - Alliance for Cardiovascular Diagnostic and Treatment Innovation (ADVANCE), Johns Hopkins University, Baltimore, MD, USA
  - Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Natalia A Trayanova
  - Alliance for Cardiovascular Diagnostic and Treatment Innovation (ADVANCE), Johns Hopkins University, Baltimore, MD, USA
  - Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
12
Lin D, Wahid KA, Nelms BE, He R, Naser MA, Duke S, Sherer MV, Christodouleas JP, Mohamed ASR, Cislo M, Murphy JD, Fuller CD, Gillespie EF. E pluribus unum: prospective acceptability benchmarking from the Contouring Collaborative for Consensus in Radiation Oncology crowdsourced initiative for multiobserver segmentation. J Med Imaging (Bellingham) 2023; 10:S11903. [PMID: 36761036 PMCID: PMC9907021 DOI: 10.1117/1.jmi.10.s1.s11903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2022] [Accepted: 01/02/2023] [Indexed: 02/11/2023] Open
Abstract
Purpose Contouring Collaborative for Consensus in Radiation Oncology (C3RO) is a crowdsourced challenge engaging radiation oncologists across various expertise levels in segmentation. An obstacle to artificial intelligence (AI) development is the paucity of multiexpert datasets; consequently, we sought to characterize whether aggregate segmentations generated from multiple nonexperts could meet or exceed recognized expert agreement. Approach Participants who contoured ≥ 1 region of interest (ROI) for the breast, sarcoma, head and neck (H&N), gynecologic (GYN), or gastrointestinal (GI) cases were identified as a nonexpert or recognized expert. Cohort-specific ROIs were combined into single simultaneous truth and performance level estimation (STAPLE) consensus segmentations. STAPLE_nonexpert ROIs were evaluated against STAPLE_expert contours using Dice similarity coefficient (DSC). The expert interobserver DSC (IODSC_expert) was calculated as an acceptability threshold between STAPLE_nonexpert and STAPLE_expert. To determine the number of nonexperts required to match the IODSC_expert for each ROI, a single consensus contour was generated using variable numbers of nonexperts and then compared to the IODSC_expert. Results For all cases, the DSC values for STAPLE_nonexpert versus STAPLE_expert were higher than comparator expert IODSC_expert for most ROIs. The minimum number of nonexpert segmentations needed for a consensus ROI to achieve IODSC_expert acceptability criteria ranged between 2 and 4 for breast, 3 and 5 for sarcoma, 3 and 5 for H&N, 3 and 5 for GYN, and 3 for GI. Conclusions Multiple nonexpert-generated consensus ROIs met or exceeded expert-derived acceptability thresholds. Five nonexperts could potentially generate consensus segmentations for most ROIs with performance approximating experts, suggesting nonexpert segmentations as feasible cost-effective AI inputs.
Affiliation(s)
- Diana Lin
  - Memorial Sloan Kettering Cancer Center, Department of Radiation Oncology, New York, New York, United States
- Kareem A. Wahid
  - The University of Texas MD Anderson Cancer Center, Department of Radiation Oncology, Houston, Texas, United States
- Renjie He
  - The University of Texas MD Anderson Cancer Center, Department of Radiation Oncology, Houston, Texas, United States
- Mohammed A. Naser
  - The University of Texas MD Anderson Cancer Center, Department of Radiation Oncology, Houston, Texas, United States
- Simon Duke
  - Cambridge University Hospitals, Department of Radiation Oncology, Cambridge, United Kingdom
- Michael V. Sherer
  - University of California San Diego, Department of Radiation Medicine and Applied Sciences, La Jolla, California, United States
- John P. Christodouleas
  - The University of Pennsylvania Cancer Center, Department of Radiation Oncology, Philadelphia, Pennsylvania, United States
  - Elekta AB, Stockholm, Sweden
- Abdallah S. R. Mohamed
  - The University of Texas MD Anderson Cancer Center, Department of Radiation Oncology, Houston, Texas, United States
- Michael Cislo
  - Memorial Sloan Kettering Cancer Center, Department of Radiation Oncology, New York, New York, United States
- James D. Murphy
  - University of California San Diego, Department of Radiation Medicine and Applied Sciences, La Jolla, California, United States
- Clifton D. Fuller
  - The University of Texas MD Anderson Cancer Center, Department of Radiation Oncology, Houston, Texas, United States
- Erin F. Gillespie
  - Memorial Sloan Kettering Cancer Center, Department of Radiation Oncology, New York, New York, United States
  - University of Washington Fred Hutchinson Cancer Center, Department of Radiation Oncology, Seattle, Washington, United States
13
Chung SY, Chang JS, Kim YB. Comprehensive clinical evaluation of deep learning-based auto-segmentation for radiotherapy in patients with cervical cancer. Front Oncol 2023; 13:1119008. [PMID: 37188180 PMCID: PMC10175826 DOI: 10.3389/fonc.2023.1119008] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/08/2022] [Accepted: 04/13/2023] [Indexed: 05/17/2023] Open
Abstract
Background and purpose Deep learning-based models have been actively investigated for various aspects of radiotherapy. However, for cervical cancer, only a few studies dealing with the auto-segmentation of organs-at-risk (OARs) and clinical target volumes (CTVs) exist. This study aimed to train a deep learning-based auto-segmentation model for OAR/CTVs for patients with cervical cancer undergoing radiotherapy and to evaluate the model's feasibility and efficacy with not only geometric indices but also comprehensive clinical evaluation. Materials and methods A total of 180 abdominopelvic computed tomography images were included (training set, 165; validation set, 15). Geometric indices such as the Dice similarity coefficient (DSC) and the 95% Hausdorff distance (HD) were analyzed. A Turing test was performed and physicians from other institutions were asked to delineate contours with and without using auto-segmented contours to assess inter-physician heterogeneity and contouring time. Results The correlation between the manual and auto-segmented contours was acceptable for the anorectum, bladder, spinal cord, cauda equina, right and left femoral heads, bowel bag, uterocervix, liver, and left and right kidneys (DSC greater than 0.80). The stomach and duodenum showed DSCs of 0.67 and 0.73, respectively. CTVs showed DSCs between 0.75 and 0.80. Turing test results were favorable for most OARs and CTVs. No auto-segmented contours had large, obvious errors. The median overall satisfaction score of the participating physicians was 7 out of 10. Auto-segmentation reduced heterogeneity and shortened contouring time by 30 min among radiation oncologists from different institutions. Most participants favored the auto-contouring system. Conclusion The proposed deep learning-based auto-segmentation model may be an efficient tool for patients with cervical cancer undergoing radiotherapy. 
Although the current model may not completely replace humans, it can serve as a useful and efficient tool in real-world clinics.
Affiliation(s)
- Seung Yeun Chung
  - Department of Radiation Oncology, Yonsei University College of Medicine, Seoul, Republic of Korea
  - Department of Radiation Oncology, Ajou University School of Medicine, Suwon, Republic of Korea
- Jee Suk Chang
  - Department of Radiation Oncology, Yonsei University College of Medicine, Seoul, Republic of Korea
- Yong Bae Kim
  - Department of Radiation Oncology, Yonsei University College of Medicine, Seoul, Republic of Korea
  - *Correspondence: Yong Bae Kim,
14
VilasBoas-Ribeiro I, Franckena M, van Rhoon GC, Hernández-Tamames JA, Paulides MM. Using MRI to measure position and anatomy changes and assess their impact on the accuracy of hyperthermia treatment planning for cervical cancer. Int J Hyperthermia 2022; 40:2151648. [PMID: 36535922 DOI: 10.1080/02656736.2022.2151648] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/24/2022] Open
Abstract
PURPOSE We studied the differences between planning and treatment position, their impact on the accuracy of hyperthermia treatment planning (HTP) predictions, and the relevance of including true treatment anatomy and position in HTP based on magnetic resonance (MR) images. MATERIALS AND METHODS All volunteers were scanned with an MR-compatible hyperthermia device, including a filled waterbolus, to replicate the treatment setup. In the planning setup, the volunteers were scanned without the device to reproduce the imaging in the current HTP. First, we used rigid registration to investigate the patient position displacements between the planning and treatment setup. Second, we performed HTP for the planning anatomy at both positions and the treatment mimicking anatomy to study the effects of positioning and anatomy on the quality of the simulated hyperthermia treatment. Treatment quality was evaluated using SAR-based parameters. RESULTS We found an average displacement of 2 cm between planning and treatment positions. These displacements caused average absolute differences of ∼12% for TC25 and 10.4%-15.9% in THQ. Furthermore, we found that including the accurate treatment position and anatomy in treatment planning led to an improvement of 2% in TC25 and 4.6%-10.6% in THQ. CONCLUSIONS This study showed that precise patient position and anatomy are relevant since these affect the accuracy of HTP predictions. The major part of improved accuracy is related to implementing the correct position of the patient in the applicator. Hence, our study shows a clear incentive to accurately match the patient position in HTP with the actual treatment.
Affiliation(s)
- Iva VilasBoas-Ribeiro
  - Department of Radiotherapy, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, The Netherlands
- Martine Franckena
  - Department of Radiotherapy, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, The Netherlands
- Gerard C van Rhoon
  - Department of Radiotherapy, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, The Netherlands
  - Department of Applied Radiation and Isotopes, Reactor Institute Delft, Delft University of Technology, Delft, The Netherlands
- Juan A Hernández-Tamames
  - Department of Radiology and Nuclear Medicine, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, The Netherlands
- Margarethus M Paulides
  - Department of Radiotherapy, Erasmus MC Cancer Institute, University Medical Center Rotterdam, Rotterdam, The Netherlands
  - Care and Cure research lab (EM-4C&C) of the Electromagnetics Group, Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
15
Claessens M, Vanreusel V, De Kerf G, Mollaert I, Löfman F, Gooding MJ, Brouwer C, Dirix P, Verellen D. Machine learning-based detection of aberrant deep learning segmentations of target and organs at risk for prostate radiotherapy using a secondary segmentation algorithm. Phys Med Biol 2022; 67. [PMID: 35561701 DOI: 10.1088/1361-6560/ac6fad] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 08/18/2021] [Accepted: 05/13/2022] [Indexed: 11/11/2022]
Abstract
Objective. The output of a deep learning (DL) auto-segmentation application should be reviewed, corrected if needed and approved before being used clinically. This verification procedure is labour-intensive, time-consuming and user-dependent, which potentially leads to significant errors with impact on the overall treatment quality. Additionally, when the time needed to correct auto-segmentations approaches the time to delineate target and organs at risk from scratch, the usability of the DL model can be questioned. Therefore, an automated quality assurance framework was developed with the aim to detect in advance aberrant auto-segmentations. Approach. Five organs (prostate, bladder, anorectum, femoral head left and right) were auto-delineated on CT acquisitions for 48 prostate patients by an in-house trained primary DL model. An experienced radiation oncologist assessed the correctness of the model output and categorised the auto-segmentations into two classes according to whether minor or major adaptations were needed. Subsequently, an independent, secondary DL model was implemented to delineate the same structures as the primary model. Quantitative comparison metrics were calculated using both models' segmentations and used as input features for a machine learning classification model to predict the output quality of the primary model. Main results. For every organ, the approach of independent validation by the secondary model was able to detect primary auto-segmentations that needed major adaptation with high sensitivity (recall = 1) based on the calculated quantitative metrics. The surface DSC and APL were found to be the most indicated parameters in comparison to standard quantitative metrics for the time needed to adapt auto-segmentations. Significance. This proposed method includes a proof of concept for the use of an independent DL segmentation model in combination with a ML classifier to improve time saving during QA of auto-segmentations.
The integration of such a system into current automatic segmentation pipelines can increase the efficiency of the radiotherapy contouring workflow.
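The data flow of this quality-assurance idea (compare a primary and an independent secondary segmentation, then decide which organs need review) can be sketched as follows. This is a deliberate simplification: the study feeds several comparison metrics into a trained machine learning classifier, whereas this sketch substitutes a single fixed Dice threshold. The threshold value, function names and toy masks are all hypothetical.

```python
from typing import Dict, Set, Tuple

Voxel = Tuple[int, int, int]


def dice(a: Set[Voxel], b: Set[Voxel]) -> float:
    """Volumetric Dice between two voxel sets (1.0 if both are empty)."""
    total = len(a) + len(b)
    return 1.0 if total == 0 else 2.0 * len(a & b) / total


def flag_for_review(
    primary: Dict[str, Set[Voxel]],
    secondary: Dict[str, Set[Voxel]],
    threshold: float = 0.8,  # hypothetical cut-off standing in for the classifier
) -> Dict[str, bool]:
    """Flag organs whose primary/secondary agreement falls below the threshold."""
    return {
        organ: dice(primary[organ], secondary[organ]) < threshold
        for organ in primary
    }


# Toy masks: the two models agree on the bladder but not on the prostate.
primary_masks = {
    "bladder": {(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 0, 3)},
    "prostate": {(5, 5, 5), (5, 5, 6)},
}
secondary_masks = {
    "bladder": {(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 0, 3)},
    "prostate": {(5, 5, 6), (5, 5, 7), (5, 5, 8), (5, 5, 9)},
}
print(flag_for_review(primary_masks, secondary_masks))
# {'bladder': False, 'prostate': True}
```

In a setting closer to the abstract, the per-organ metrics (including surface DSC and APL) would form a feature vector for a classifier trained on the oncologist's minor/major labels rather than being compared against one cut-off.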
Affiliation(s)
- Michaël Claessens
  - Department of Radiation Oncology, Iridium Network, Wilrijk (Antwerp), Belgium
  - Centre for Oncological Research (CORE), Integrated Personalized and Precision Oncology Network (IPPON), University of Antwerp, Belgium
- Verdi Vanreusel
  - Department of Radiation Oncology, Iridium Network, Wilrijk (Antwerp), Belgium
- Geert De Kerf
  - Department of Radiation Oncology, Iridium Network, Wilrijk (Antwerp), Belgium
- Isabelle Mollaert
  - Department of Radiation Oncology, Iridium Network, Wilrijk (Antwerp), Belgium
- Fredrik Löfman
  - Department of Machine Learning, RaySearch Laboratories AB, Stockholm, Sweden
- Charlotte Brouwer
  - University of Groningen, University Medical Center Groningen, Department of Radiation Oncology, The Netherlands
- Piet Dirix
  - Department of Radiation Oncology, Iridium Network, Wilrijk (Antwerp), Belgium
  - Centre for Oncological Research (CORE), Integrated Personalized and Precision Oncology Network (IPPON), University of Antwerp, Belgium
- Dirk Verellen
  - Department of Radiation Oncology, Iridium Network, Wilrijk (Antwerp), Belgium
  - Centre for Oncological Research (CORE), Integrated Personalized and Precision Oncology Network (IPPON), University of Antwerp, Belgium
16
Naser MA, Wahid KA, van Dijk LV, He R, Abdelaal MA, Dede C, Mohamed ASR, Fuller CD. Head and Neck Cancer Primary Tumor Auto Segmentation Using Model Ensembling of Deep Learning in PET/CT Images. Head and Neck Tumor Segmentation and Outcome Prediction: Second Challenge, HECKTOR 2021, Held in Conjunction with MICCAI 2021, Strasbourg, France, September 27, 2021, Proceedings. Head and Neck Tumor Segmentation Challenge (2nd : 2021 ... 2022; 13209:121-132. [PMID: 35399869 PMCID: PMC8991449 DOI: 10.1007/978-3-030-98253-9_11] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/23/2022]
Abstract
Auto-segmentation of primary tumors in oropharyngeal cancer using PET/CT images is an unmet need that has the potential to improve radiation oncology workflows. In this study, we develop a series of deep learning models based on a 3D Residual Unet (ResUnet) architecture that can segment oropharyngeal tumors with high performance as demonstrated through internal and external validation of large-scale datasets (training size = 224 patients, testing size = 101 patients) as part of the 2021 HECKTOR Challenge. Specifically, we leverage ResUNet models with either 256 or 512 bottleneck layer channels that demonstrate internal validation (10-fold cross-validation) mean Dice similarity coefficient (DSC) up to 0.771 and median 95% Hausdorff distance (95% HD) as low as 2.919 mm. We employ label fusion ensemble approaches, including Simultaneous Truth and Performance Level Estimation (STAPLE) and a voxel-level threshold approach based on majority voting (AVERAGE), to generate consensus segmentations on the test data by combining the segmentations produced through different trained cross-validation models. We demonstrate that our best performing ensembling approach (256 channels AVERAGE) achieves a mean DSC of 0.770 and median 95% HD of 3.143 mm through independent external validation on the test set. Our DSC and 95% HD test results are within 0.01 and 0.06 mm of the top ranked model in the competition, respectively. Concordance of internal and external validation results suggests our models are robust and can generalize well to unseen PET/CT data. We advocate that ResUNet models coupled to label fusion ensembling approaches are promising candidates for PET/CT oropharyngeal primary tumors auto-segmentation. Future investigations should target the ideal combination of channel combinations and label fusion strategies to maximize segmentation performance.
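The voxel-level majority-voting fusion (called AVERAGE in the abstract) can be sketched as below. The example assumes each cross-validation model yields a flattened binary mask of equal length and that a voxel is kept only when strictly more than half of the models mark it; the tie-handling rule, function name and toy predictions are assumptions, and STAPLE is not reproduced here.

```python
from typing import List, Sequence


def majority_vote(masks: Sequence[Sequence[int]]) -> List[int]:
    """Fuse flattened binary masks: keep a voxel if more than half the models mark it."""
    if not masks:
        raise ValueError("at least one mask is required")
    n_models = len(masks)
    length = len(masks[0])
    if any(len(mask) != length for mask in masks):
        raise ValueError("all masks must have the same number of voxels")
    fused = []
    for votes in zip(*masks):
        # Equivalent to thresholding the voxel-wise average at 0.5.
        fused.append(1 if sum(votes) * 2 > n_models else 0)
    return fused


# Toy predictions from three hypothetical cross-validation models.
fold_predictions = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
]
print(majority_vote(fold_predictions))  # [1, 1, 0, 0, 1]
```

With an even number of models a tie is resolved toward background under this rule; a different convention would change the fused contour at disputed voxels.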
Affiliation(s)
- Mohamed A Naser
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Kareem A Wahid
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Lisanne V van Dijk
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Renjie He
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Moamen Abobakr Abdelaal
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Cem Dede
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Abdallah S R Mohamed
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Clifton D Fuller
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
17
Amjad A, Xu J, Thill D, Lawton C, Hall W, Awan MJ, Shukla M, Erickson BA, Li XA. General and custom deep learning autosegmentation models for organs in head and neck, abdomen, and male pelvis. Med Phys 2022; 49:1686-1700. [PMID: 35094390 PMCID: PMC8917093 DOI: 10.1002/mp.15507] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 07/13/2021] [Revised: 01/19/2022] [Accepted: 01/21/2022] [Indexed: 11/11/2022] Open
Abstract
PURPOSE To reduce workload and inconsistencies in organ segmentation for radiation treatment planning, we developed and evaluated general and custom autosegmentation models on computed tomography (CT) for three major tumor sites using a well-established deep convolutional neural network (DCNN). METHODS Five CT-based autosegmentation models for 42 organs at risk (OARs) in head and neck (HN), abdomen (ABD), and male pelvis (MP) were developed using a full three-dimensional (3D) DCNN architecture. Two types of deep learning (DL) models were separately trained using either general diversified multi-institutional datasets or custom well-controlled single-institution datasets. To improve segmentation accuracy, an adaptive spatial resolution approach for small and/or narrow OARs and a pseudo scan extension approach, for cases in which the CT scan length is too short to cover entire organs, were implemented. The performance of the obtained models was evaluated based on accuracy and clinical applicability of the autosegmented contours using qualitative visual inspection and quantitative calculation of Dice similarity coefficient (DSC), mean distance to agreement (MDA), and time efficiency. RESULTS The five DL autosegmentation models developed for the three anatomical sites were found to have high accuracy (DSC ranging from 0.8 to 0.98) for 74% of OARs and marginally acceptable accuracy for 26% of OARs. The custom models performed slightly better than the general models, even with smaller custom datasets used for the custom model training. The organ-based approaches improved autosegmentation accuracy for small or complex organs (e.g., eye lens, optic nerves, inner ears, and bowels). Compared with traditional manual contouring times, the autosegmentation times, including subsequent manual editing, if necessary, were substantially reduced by 88% for MP, 80% for HN, and 65% for ABD models.
CONCLUSIONS The obtained autosegmentation models, incorporating organ-based approaches, were found to be effective and accurate for most OARs in the male pelvis, head and neck, and abdomen. We have demonstrated that our multianatomical DL autosegmentation models are clinically useful for radiation treatment planning.
Affiliation(s)
- Asma Amjad
- Department of Radiation Oncology, Medical College of Wisconsin, WI, USA
- Colleen Lawton
- Department of Radiation Oncology, Medical College of Wisconsin, WI, USA
- William Hall
- Department of Radiation Oncology, Medical College of Wisconsin, WI, USA
- Musaddiq J. Awan
- Department of Radiation Oncology, Medical College of Wisconsin, WI, USA
- Monica Shukla
- Department of Radiation Oncology, Medical College of Wisconsin, WI, USA
- Beth A. Erickson
- Department of Radiation Oncology, Medical College of Wisconsin, WI, USA
- X. Allen Li
- Department of Radiation Oncology, Medical College of Wisconsin, WI, USA
|
18
|
Smith AG, Petersen J, Terrones-Campos C, Berthelsen AK, Forbes NJ, Darkner S, Specht L, Vogelius IR. RootPainter3D: Interactive-machine-learning enables rapid and accurate contouring for radiotherapy. Med Phys 2021; 49:461-473. [PMID: 34783028 DOI: 10.1002/mp.15353] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2021] [Revised: 09/22/2021] [Accepted: 10/28/2021] [Indexed: 11/07/2022] Open
Abstract
PURPOSE Organ-at-risk contouring is still a bottleneck in radiotherapy, with many deep learning methods falling short of promised results when evaluated on clinical data. We investigate the accuracy and time-savings resulting from the use of an interactive-machine-learning method for an organ-at-risk contouring task. METHODS We implement an open-source interactive-machine-learning software application that facilitates corrective-annotation for deep-learning generated contours on X-ray CT images. A trained-physician contoured 933 hearts using our software by delineating the first image, starting model training, and then correcting the model predictions for all subsequent images. These corrections were added into the training data, which was used for continuously training the assisting model. From the 933 hearts, the same physician also contoured the first 10 and last 10 in Eclipse (Varian) to enable comparison in terms of accuracy and duration. RESULTS We find strong agreement with manual delineations, with a dice score of 0.95. The annotations created using corrective-annotation also take less time to create as more images are annotated, resulting in substantial time savings compared to manual methods. After 923 images had been delineated, hearts took 2 min and 2 s to delineate on average, which includes time to evaluate the initial model prediction and assign the needed corrections, compared to 7 min and 1 s when delineating manually. CONCLUSIONS Our experiment demonstrates that interactive-machine-learning with corrective-annotation provides a fast and accessible way for non computer-scientists to train deep-learning models to segment their own structures of interest as part of routine clinical workflows.
Affiliation(s)
- Abraham George Smith
- Department of Computer Science, University of Copenhagen, Copenhagen, Denmark; Department of Oncology, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
- Jens Petersen
- Department of Computer Science, University of Copenhagen, Copenhagen, Denmark; Department of Oncology, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
- Cynthia Terrones-Campos
- Department of Oncology, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark; Department of Infectious Diseases, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
- Anne Kiil Berthelsen
- Department of Oncology, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark; Department of Clinical Physiology, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
- Nora Jarrett Forbes
- Department of Computer Science, University of Copenhagen, Copenhagen, Denmark; Department of Oncology, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
- Sune Darkner
- Department of Computer Science, University of Copenhagen, Copenhagen, Denmark
- Lena Specht
- Department of Oncology, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
- Ivan Richter Vogelius
- Department of Oncology, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark; Department of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
|
19
|
Aiello M, Esposito G, Pagliari G, Borrelli P, Brancato V, Salvatore M. How does DICOM support big data management? Investigating its use in medical imaging community. Insights Imaging 2021; 12:164. [PMID: 34748101 PMCID: PMC8574146 DOI: 10.1186/s13244-021-01081-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2021] [Accepted: 08/25/2021] [Indexed: 12/15/2022] Open
Abstract
The diagnostic imaging field is experiencing considerable growth, followed by increasing production of massive amounts of data. The lack of standardization and privacy concerns are considered the main barriers to big data capitalization. This work aims to verify whether the advanced features of the DICOM standard, beyond imaging data storage, are effectively used in research practice. This issue will be analyzed by investigating the publicly shared medical imaging databases and assessing how much the most common medical imaging software tools support DICOM in all its potential. Therefore, 100 public databases and ten medical imaging software tools were selected and examined using a systematic approach. In particular, the DICOM fields related to privacy, segmentation and reporting have been assessed in the selected database; software tools have been evaluated for reading and writing the same DICOM fields. From our analysis, less than a third of the databases examined use the DICOM format to record meaningful information to manage the images. Regarding software, the vast majority does not allow the management, reading and writing of some or all the DICOM fields. Surprisingly, if we observe chest computed tomography data sharing to address the COVID-19 emergency, there are only two datasets out of 12 released in DICOM format. Our work shows how the DICOM can potentially fully support big data management; however, further efforts are still needed from the scientific and technological community to promote the use of the existing standard, encouraging data sharing and interoperability for a concrete development of big data analytics.
Affiliation(s)
- Marco Aiello
- IRCCS SDN, Via Emanuele Gianturco 113, 80143, Naples, Italy.
|
20
|
Lizzi F, Agosti A, Brero F, Cabini RF, Fantacci ME, Figini S, Lascialfari A, Laruina F, Oliva P, Piffer S, Postuma I, Rinaldi L, Talamonti C, Retico A. Quantification of pulmonary involvement in COVID-19 pneumonia by means of a cascade of two U-nets: training and assessment on multiple datasets using different annotation criteria. Int J Comput Assist Radiol Surg 2021; 17:229-237. [PMID: 34698988 PMCID: PMC8547130 DOI: 10.1007/s11548-021-02501-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2021] [Accepted: 09/15/2021] [Indexed: 12/24/2022]
Abstract
Purpose This study aims at exploiting artificial intelligence (AI) for the identification, segmentation and quantification of COVID-19 pulmonary lesions. The limited data availability and the annotation quality are relevant factors in training AI-methods. We investigated the effects of using multiple datasets, heterogeneously populated and annotated according to different criteria. Methods We developed an automated analysis pipeline, the LungQuant system, based on a cascade of two U-nets. The first one (U-net_1) is devoted to the identification of the lung parenchyma; the second one (U-net_2) acts on a bounding box enclosing the segmented lungs to identify the areas affected by COVID-19 lesions. Different public datasets were used to train the U-nets and to evaluate their segmentation performances, which have been quantified in terms of the Dice Similarity Coefficients. The accuracy in predicting the CT-Severity Score (CT-SS) of the LungQuant system has also been evaluated. Results Both the volumetric DSC (vDSC) and the accuracy showed a dependency on the annotation quality of the released data samples. On an independent dataset (COVID-19-CT-Seg), both the vDSC and the surface DSC (sDSC) were measured between the masks predicted by LungQuant system and the reference ones. The vDSC (sDSC) values of 0.95±0.01 and 0.66±0.13 (0.95±0.02 and 0.76±0.18, with 5 mm tolerance) were obtained for the segmentation of lungs and COVID-19 lesions, respectively. The system achieved an accuracy of 90% in CT-SS identification on this benchmark dataset. Conclusion We analysed the impact of using data samples with different annotation criteria in training an AI-based quantification system for pulmonary involvement in COVID-19 pneumonia. In terms of vDSC measures, the U-net segmentation strongly depends on the quality of the lesion annotations. Nevertheless, the CT-SS can be accurately predicted on independent test sets, demonstrating the satisfactory generalization ability of the LungQuant. Supplementary Information The online version contains supplementary material available at 10.1007/s11548-021-02501-2.
Affiliation(s)
- Francesca Lizzi
- Scuola Normale Superiore, Pisa, Italy; National Institute of Nuclear Physics (INFN), Pisa division, Pisa, Italy
- Abramo Agosti
- Department of Mathematics, University of Pavia, Pavia, Italy
- Francesca Brero
- INFN, Pavia division, Pavia, Italy; Department of Physics, University of Pavia, Pavia, Italy
- Raffaella Fiamma Cabini
- INFN, Pavia division, Pavia, Italy; Department of Mathematics, University of Pavia, Pavia, Italy
- Maria Evelina Fantacci
- National Institute of Nuclear Physics (INFN), Pisa division, Pisa, Italy; Department of Physics, University of Pisa, Pisa, Italy
- Silvia Figini
- INFN, Pavia division, Pavia, Italy; Department of Social and Political Science, University of Pavia, Pavia, Italy
- Alessandro Lascialfari
- INFN, Pavia division, Pavia, Italy; Department of Physics, University of Pavia, Pavia, Italy
- Francesco Laruina
- Scuola Normale Superiore, Pisa, Italy; National Institute of Nuclear Physics (INFN), Pisa division, Pisa, Italy
- Piernicola Oliva
- Department of Chemistry and Pharmacy, University of Sassari, Sassari, Italy; INFN, Cagliari division, Cagliari, Italy
- Stefano Piffer
- Department of Biomedical Experimental Clinical Science "M. Serio", University of Florence, Florence, Italy; INFN, Florence division, Florence, Italy
- Lisa Rinaldi
- INFN, Pavia division, Pavia, Italy; Department of Physics, University of Pavia, Pavia, Italy
- Cinzia Talamonti
- Department of Biomedical Experimental Clinical Science "M. Serio", University of Florence, Florence, Italy; INFN, Florence division, Florence, Italy
- Alessandra Retico
- National Institute of Nuclear Physics (INFN), Pisa division, Pisa, Italy
|
21
|
Nikolov S, Blackwell S, Zverovitch A, Mendes R, Livne M, De Fauw J, Patel Y, Meyer C, Askham H, Romera-Paredes B, Kelly C, Karthikesalingam A, Chu C, Carnell D, Boon C, D'Souza D, Moinuddin SA, Garie B, McQuinlan Y, Ireland S, Hampton K, Fuller K, Montgomery H, Rees G, Suleyman M, Back T, Hughes CO, Ledsam JR, Ronneberger O. Clinically Applicable Segmentation of Head and Neck Anatomy for Radiotherapy: Deep Learning Algorithm Development and Validation Study. J Med Internet Res 2021; 23:e26151. [PMID: 34255661 PMCID: PMC8314151 DOI: 10.2196/26151] [Citation(s) in RCA: 118] [Impact Index Per Article: 39.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2020] [Revised: 02/10/2021] [Accepted: 04/30/2021] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND Over half a million individuals are diagnosed with head and neck cancer each year globally. Radiotherapy is an important curative treatment for this disease, but it requires manual time to delineate radiosensitive organs at risk. This planning process can delay treatment while also introducing interoperator variability, resulting in downstream radiation dose differences. Although auto-segmentation algorithms offer a potentially time-saving solution, the challenges in defining, quantifying, and achieving expert performance remain. OBJECTIVE Adopting a deep learning approach, we aim to demonstrate a 3D U-Net architecture that achieves expert-level performance in delineating 21 distinct head and neck organs at risk commonly segmented in clinical practice. METHODS The model was trained on a data set of 663 deidentified computed tomography scans acquired in routine clinical practice and with both segmentations taken from clinical practice and segmentations created by experienced radiographers as part of this research, all in accordance with consensus organ at risk definitions. RESULTS We demonstrated the model's clinical applicability by assessing its performance on a test set of 21 computed tomography scans from clinical practice, each with 21 organs at risk segmented by 2 independent experts. We also introduced surface Dice similarity coefficient, a new metric for the comparison of organ delineation, to quantify the deviation between organ at risk surface contours rather than volumes, better reflecting the clinical task of correcting errors in automated organ segmentations. The model's generalizability was then demonstrated on 2 distinct open-source data sets, reflecting different centers and countries to model training. CONCLUSIONS Deep learning is an effective and clinically applicable technique for the segmentation of the head and neck anatomy for radiotherapy. 
With appropriate validation studies and regulatory approvals, this system could improve the efficiency, consistency, and safety of radiotherapy pathways.
Affiliation(s)
- Ruheena Mendes
- University College London Hospitals NHS Foundation Trust, London, United Kingdom
- Dawn Carnell
- University College London Hospitals NHS Foundation Trust, London, United Kingdom
- Cheng Boon
- Clatterbridge Cancer Centre NHS Foundation Trust, Liverpool, United Kingdom
- Derek D'Souza
- University College London Hospitals NHS Foundation Trust, London, United Kingdom
- Syed Ali Moinuddin
- University College London Hospitals NHS Foundation Trust, London, United Kingdom
- Geraint Rees
- University College London, London, United Kingdom
|