As breast radiotherapy becomes more conformal, precise target delineation is increasingly important. Multiple atlases are available to promote uniformity, but contouring remains resource intensive and prone to variation. Automated contouring has been investigated as a way to reduce planning time and standardize contouring practices. Herein, we investigate a model trained to autocontour RADCOMP breast regional nodal targets using a small, highly curated set of manual contours. The contours for 15 patients previously treated on the RADCOMP study were edited per the RADCOMP atlas guidelines and included the bilateral supraclavicular region (SCR), posterior triangle, axilla, breast, and internal mammary nodes (IMN). A 3D U-Net architecture was trained on these volumes to autocontour each structure. For validation, 20 new cases were autocontoured with this model. The autocontours were independently scored for accuracy by three physicians on a 5-point modified Likert scale. A score of 3 indicated that the reviewer judged some edits to be clinically important, but that editing the automatically generated contours was more efficient than starting without an autocontour. A score of 4 indicated that differences were stylistic and not clinically significant. A score of 5 indicated that contours were clinically acceptable without stylistic changes. To evaluate time saved, a breast specialist radiation oncologist first edited the autocontours and then manually contoured the bilateral target regions of interest on 5 cases. Finally, the edited contours were objectively compared with the unedited autocontours for similarity using the Dice similarity coefficient (DSC) and mean surface distance (MSD). Twenty retrospectively autocontoured cases were evaluated by three physicians for clinical appropriateness.
Mean Likert scores for each structure were as follows: L Breast: 3.6, R Breast: 3.4, L Axilla: 4.0, R Axilla: 3.9, L IMN: 3.6, R IMN: 3.6, L Posterior Triangle: 4.0, R Posterior Triangle: 3.9, L SCR: 3.8, and R SCR: 3.8. For the timed portion of the study, the mean time spent editing autocontours for clinical appropriateness was 5 minutes and 43 seconds ± 64.4 seconds, while the mean time spent contouring manually was 11 minutes and 36 seconds ± 50.1 seconds (p < 0.001; paired t-test). Average DSC and MSD values comparing the contours before and after clinical edits were 0.99 and 0.3 mm, 0.99 and 0.2 mm, 0.93 and 0.3 mm, 0.97 and 0.2 mm, and 0.92 and 0.3 mm for the breast, axilla, IMN, SCR, and posterior triangle, respectively. This study demonstrates that a small, carefully curated dataset can train an autocontouring model that is subjectively useful, time efficient, and objectively accurate. Future studies using the RADCOMP atlas may benefit from autocontouring to standardize treatment or streamline central verification of treatment planning.
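For readers unfamiliar with the two similarity metrics reported above, the following is a minimal sketch of how DSC and MSD can be computed between an unedited and an edited contour. It is illustrative only and is not the study's implementation: the voxel-set representation, the function names (`dice`, `surface`, `mean_surface_distance`), the face-neighbor definition of a surface voxel, the pooled symmetric form of MSD, and the toy cubes are all assumptions, since the abstract does not specify these details.

```python
import math
from itertools import product
from typing import Set, Tuple

Voxel = Tuple[int, int, int]  # (z, y, x) index of one voxel inside a contour

# Offsets to the six face-adjacent neighbors of a voxel.
NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def dice(a: Set[Voxel], b: Set[Voxel]) -> float:
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))


def surface(mask: Set[Voxel]) -> Set[Voxel]:
    """Voxels of the mask with at least one face-neighbor outside the mask."""
    return {
        v for v in mask
        if any((v[0] + dz, v[1] + dy, v[2] + dx) not in mask
               for dz, dy, dx in NEIGHBORS)
    }


def mean_surface_distance(a: Set[Voxel], b: Set[Voxel],
                          spacing=(1.0, 1.0, 1.0)) -> float:
    """Symmetric mean surface distance in the units of `spacing` (e.g. mm).

    Assumed definition: nearest-neighbor distances from each surface to the
    other are pooled and averaged. Brute force, so only for small masks.
    """
    surf_a, surf_b = surface(a), surface(b)
    if not surf_a or not surf_b:
        raise ValueError("both masks must be non-empty")
    pts_a = [tuple(i * s for i, s in zip(v, spacing)) for v in surf_a]
    pts_b = [tuple(i * s for i, s in zip(v, spacing)) for v in surf_b]
    d_ab = [min(math.dist(p, q) for q in pts_b) for p in pts_a]
    d_ba = [min(math.dist(q, p) for p in pts_a) for q in pts_b]
    return (sum(d_ab) + sum(d_ba)) / (len(d_ab) + len(d_ba))


# Toy data (hypothetical): a 4x4x4 cube and the same cube shifted one voxel in x.
cube_a = set(product(range(4), range(4), range(4)))
cube_b = set(product(range(4), range(4), range(1, 5)))

print(dice(cube_a, cube_b))                   # 0.75
print(mean_surface_distance(cube_a, cube_a))  # 0.0
print(mean_surface_distance(cube_a, cube_b, spacing=(3.0, 1.0, 1.0)))
```

A DSC near 1 and an MSD near 0 mm, as reported above, indicate that the clinical edits changed the autocontours very little. On full-resolution CT volumes, an array-based distance transform would be far more practical than this brute-force search.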