<h3>1. Purpose</h3> Auto-segmentation is highly desirable in brachytherapy treatment planning because the time between applicator implantation and treatment is limited. Minimizing contouring variability for the high-risk clinical target volume (HRCTV) and organs at risk (OARs) can improve the consistency of plan quality. Many novel deep learning architectures have been proposed for organ segmentation, but fine-tuning their hyper-parameters is tedious and time-consuming. This study aims to apply a self-configuring ensemble method for fast and reproducible auto-segmentation in gynecological cancer.
<h3>2. Methods and Materials</h3>
<b>2.1 Patient Data:</b> A total of 237 brachytherapy cases were selected for this retrospective study; 207 patients were used for training and 30 for testing. The patients were treated with an applicator set chosen from among the Utrecht Applicator, Ovoid Applicator, Vaginal Multi-Channel Applicator, Fletcher Applicator, and Tandem with up to 10 interstitial needles. The HRCTV and OARs were manually delineated by an experienced radiation oncologist.
<b>2.2 Geometric Evaluation:</b> For quantitative evaluation, we used the Dice Similarity Coefficient (DSC), Average Surface Distance (ASD), and 95% Hausdorff distance (HD95%). For qualitative evaluation, two radiation oncologists visually assessed the auto-segmentation results and graded them on a 4-point Likert scale.
<b>2.3 Dosimetric Evaluation:</b> For the HRCTV, D<sub>90%</sub>, V<sub>100%</sub>, V<sub>150%</sub>, and V<sub>200%</sub> were evaluated. For the OARs, D<sub>2cc</sub>, D<sub>1cc</sub>, D<sub>0.1cc</sub>, and D<sub>max</sub> were evaluated. Because the dose distribution, and by extension the dose-volume parameters, can vary widely among plans, a custom Python program was developed to calculate the dose-volume parameters for the predicted contours (P<sub>predicted</sub>) using the dose map of the original plan, which was created from the manual contours (P<sub>original</sub>). That is, P<sub>predicted</sub> and P<sub>original</sub> share the same dose map, simulating the same applicator position, source dwell positions, and dwell times. Model performance was quantified as the residual of the dose-volume parameters between P<sub>predicted</sub> and P<sub>original</sub>.
<b>2.4 Auto-segmentation Network:</b> In this study, nnU-Net was selected to provide a standardized workflow for accurate and reproducible segmentation. The nnU-Net architecture template is a 'U-Net-like' encoder-decoder with skip connections and instance normalization. It provides three architectures based on the U-Net backbone: a 2D U-Net, a 3D U-Net trained on images at full resolution (3D-Fullres), and a 3D U-Net cascade network (3D-Cascade). Networks are ensembled by averaging their softmax probabilities.
<b>2.5 Statistical Analysis:</b> Cohen's kappa (κ) was calculated to evaluate the interobserver agreement between the two radiation oncologists in the qualitative evaluation.
<h3>3. Results</h3>
<b>3.1 Geometric Evaluation:</b> In the quantitative evaluation, 3D-Cascade achieved the best performance for the bladder (DSC: 0.936±0.051, HD95%: 3.503±1.956, ASD: 0.944±0.503), rectum (DSC: 0.831±0.074, HD95%: 7.579±5.857, ASD: 3.6±3.485), and HRCTV (DSC: 0.836±0.07, HD95%: 7.42±5.023, ASD: 2.094±1.311). The qualitative evaluation showed that no or only minor visually detectable segmentation errors occurred in more than 90% of data sets. Good interobserver agreement was achieved for the 2D (κ = 0.67), 3D-Fullres (κ = 0.69), 3D-Cascade (κ = 0.78), and ensemble (κ = 0.71) segmentations.
<b>3.2 Dosimetric Evaluation:</b> The average difference in D<sub>90%</sub> for the HRCTV was less than 7.8%, and the average difference in D<sub>2cc</sub> for the OARs was less than 15%.
<h3>4. Conclusion</h3> In this work, we have demonstrated that it is feasible to use a standardized nnU-Net method for OAR and HRCTV segmentation in gynecological cancer.
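The dosimetric comparison in Section 2.3 can be illustrated with a minimal sketch. This is not the authors' program: the function names, the voxel-counting approach (no partial-volume interpolation), and the definition of the residual as a relative percentage are all assumptions made for illustration. It assumes the dose map and the binary contour masks have already been resampled onto the same voxel grid.

```python
import numpy as np


def d_percent(dose, mask, percent):
    """Minimum dose received by the hottest `percent` % of the structure (e.g. D90%)."""
    voxels = dose[np.asarray(mask, dtype=bool)]
    # The dose covering the hottest X% is the (100 - X)th percentile of voxel doses.
    return float(np.percentile(voxels, 100.0 - percent))


def v_percent(dose, mask, prescription, percent):
    """Percentage of the structure volume receiving >= `percent` % of the prescription (e.g. V100%)."""
    voxels = dose[np.asarray(mask, dtype=bool)]
    threshold = prescription * percent / 100.0
    return 100.0 * float(np.mean(voxels >= threshold))


def d_cc(dose, mask, volume_cc, voxel_volume_cc):
    """Minimum dose within the hottest `volume_cc` of the structure (e.g. D2cc)."""
    voxels = np.sort(dose[np.asarray(mask, dtype=bool)])[::-1]  # hottest first
    n = int(round(volume_cc / voxel_volume_cc))  # voxels making up the volume
    n = min(max(n, 1), voxels.size)  # clamp for very small structures
    return float(voxels[n - 1])


def relative_residual(predicted, original):
    """Residual of a dose-volume parameter in percent (assumed definition)."""
    return 100.0 * (predicted - original) / original


if __name__ == "__main__":
    # Synthetic data only: one shared dose map, two slightly different masks.
    rng = np.random.default_rng(0)
    dose_map = rng.uniform(0.0, 12.0, size=(20, 32, 32))  # hypothetical dose in Gy
    manual = np.zeros(dose_map.shape, dtype=bool)
    manual[5:15, 8:24, 8:24] = True
    predicted = np.zeros(dose_map.shape, dtype=bool)
    predicted[5:15, 9:25, 8:24] = True  # shifted by one voxel

    d90_original = d_percent(dose_map, manual, 90)
    d90_predicted = d_percent(dose_map, predicted, 90)
    print("D90% residual (%):", relative_residual(d90_predicted, d90_original))
```

Because both masks are evaluated against the same dose array, any difference in the resulting parameters comes from the contours alone, which is the point of sharing the dose map between the two plans.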