Abstract
BACKGROUND/PURPOSE
Delineation of the lymph node levels of the neck for irradiation of the elective clinical target volume in head and neck cancer (HNC) patients is time-consuming and prone to interobserver variability (IOV), even though international consensus guidelines exist. The aim of this study was to develop and validate a 3D convolutional neural network (CNN) for semi-automated delineation of all nodal neck levels, focussing on delineation accuracy, efficiency and consistency compared to manual delineation.
MATERIAL/METHODS
The CNN was trained on a clinical dataset of 69 HNC patients. For validation, 17 lymph node levels were manually delineated in 16 new patients by two observers, independently, using international consensus guidelines. Automated delineations were generated by applying the CNN and were subsequently corrected by both observers separately as needed for clinical acceptance. The manual and corrected delineations were performed two weeks apart and blinded to each other. IOV was quantified using the Dice similarity coefficient (DSC), mean surface distance (MSD) and Hausdorff distance (HD). To assess automated delineation accuracy, agreement between automated and corrected delineations was evaluated using the same measures. To assess efficiency, the times taken for manual and corrected delineations were compared. In a second step, only the clinically relevant neck levels were selected and delineated, once again both manually and by applying and correcting the network.
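The three agreement measures named above can be illustrated with a minimal sketch for two binary voxel masks. This is not the study's implementation, which the abstract does not describe: the function names, the `spacing` parameter (voxel size in mm) and the symmetric-average definition of MSD are assumptions made for illustration, and MSD definitions vary between publications. The brute-force distance computation is suitable only for small structures.

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient of two binary masks (1.0 if both are empty)."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    return 1.0 if total == 0 else 2.0 * np.logical_and(a, b).sum() / total

def surface(mask):
    """Mask voxels with at least one background face-neighbour."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, constant_values=False)
    core = tuple(slice(1, -1) for _ in range(mask.ndim))
    interior = mask.copy()
    for axis in range(mask.ndim):
        for shift in (-1, 1):
            # A voxel stays "interior" only if every face-neighbour is foreground.
            interior &= np.roll(padded, shift, axis=axis)[core]
    return mask & ~interior

def surface_distances(a, b, spacing):
    """Nearest-neighbour distances (in mm) from each surface to the other."""
    pa = np.argwhere(surface(a)) * np.asarray(spacing, dtype=float)
    pb = np.argwhere(surface(b)) * np.asarray(spacing, dtype=float)
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    return d.min(axis=1), d.min(axis=0)

def mean_surface_distance(a, b, spacing=(1.0, 1.0, 1.0)):
    """Symmetric MSD: average of the two directed mean distances (assumed definition)."""
    d_ab, d_ba = surface_distances(a, b, spacing)
    return 0.5 * (d_ab.mean() + d_ba.mean())

def hausdorff_distance(a, b, spacing=(1.0, 1.0, 1.0)):
    """Symmetric Hausdorff distance: largest nearest-neighbour surface distance."""
    d_ab, d_ba = surface_distances(a, b, spacing)
    return max(d_ab.max(), d_ba.max())

# Toy example: a 4x4x4 cube and the same cube shifted by one voxel.
manual = np.zeros((10, 10, 10), dtype=bool)
manual[2:6, 2:6, 2:6] = True
corrected = np.zeros_like(manual)
corrected[3:7, 2:6, 2:6] = True
print(dice(manual, corrected))                # 0.75
print(hausdorff_distance(manual, corrected))  # 1.0
```

In this toy case the two cubes share 48 of their 64 voxels each, giving a DSC of 0.75, and no surface voxel lies more than one voxel from the other surface, giving an HD of 1 mm at unit spacing.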
RESULTS
When all lymph node levels were delineated, correcting automated delineations took significantly less time than manual delineation for both observers (mean: 35 vs 52 min, p < 10⁻⁵). Based on DSC, automated delineation agreed best with corrected delineation for lymph node levels Ib, II-IVa, VIa, VIb, VIIa and VIIb (DSC > 85%). Manual corrections necessary for clinical acceptance averaged 1.4 mm MSD and were especially small (< 1 mm) for levels II-IVa, VIa, VIIa and VIIb. IOV was significantly smaller with automated than with manual delineations (MSD: 1.4 mm vs 2.5 mm, p < 10⁻¹¹). When only the clinically relevant neck levels were delineated, the correction time was also significantly shorter (mean: 8 vs 15 min, p < 10⁻⁵). Based on DSC, automated delineation agreed very well with corrected delineation (DSC > 87%). Manual corrections necessary for clinical acceptance averaged 1.3 mm MSD. IOV was significantly smaller with automated than with manual delineations (MSD: 0.8 mm vs 2.3 mm, p < 10⁻³).
CONCLUSION
The CNN developed for automated delineation of the elective lymph node levels in the neck in HNC was shown to be more efficient and more consistent than manual delineation, which justifies its implementation in clinical practice.