Background and purpose
This study aimed to train and validate a deep learning (DL) auto-segmentation model for the nodal clinical target volume (CTVn) in high-risk breast cancer (BC) patients, with both the training and validation datasets created through multi-institutional participation and with the overall aim of national clinical implementation in Denmark.

Materials and methods
A gold standard (GS) dataset and a high-quality training dataset were created by 21 BC delineation experts from all radiotherapy centres in Denmark. The delineations followed the ESTRO consensus delineation guidelines. Four models were trained: one for each combination of laterality and extension of the CTVn internal mammary nodes. The DL models were evaluated quantitatively on their own test sets and against the interobserver variation (IOV) in the GS dataset using geometric metrics such as the Dice Similarity Coefficient (DSC). A blinded qualitative evaluation was conducted by a national board, which was presented with both DL and manual delineations.

Results
A median DSC > 0.7 was found for all structures except the CTVn interpectoral node in one of the models. In the qualitative evaluation, ‘no corrections needed’ was assigned to 297 (36 %) of the DL structures and 286 (34 %) of the manual delineations. Higher rates of ‘major corrections’ and ‘easier to start from scratch’ were found for the manual delineations. The models performed within the IOV of an expert group, with two exceptions.

Conclusion
DL models were developed on a national consensus cohort. They performed on par with the IOV between BC experts and had clinical acceptance comparable to or higher than that of expert manual delineations.
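
For readers unfamiliar with the DSC: for two delineations A and B it is defined as 2|A ∩ B| / (|A| + |B|), ranging from 0 (no overlap) to 1 (identical volumes). The following is a minimal Python sketch of that definition only. The function name `dice`, the representation of a delineation as a set of voxel indices, and the toy data are illustrative assumptions and are not taken from the study, whose evaluation pipeline is not described in the abstract.

```python
def dice(mask_a, mask_b):
    """Dice Similarity Coefficient between two binary delineations.

    Each delineation is given as a collection of voxel indices
    (an assumed, illustrative representation).
    """
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        # Assumed convention: two empty delineations agree perfectly.
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Toy example: two 4-voxel delineations sharing 3 voxels.
dl_structure = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
manual_structure = {(0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 1, 1)}

score = dice(dl_structure, manual_structure)  # 2 * 3 / (4 + 4) = 0.75
print(f"DSC = {score:.2f}")
```

In this toy case the two volumes share three of their four voxels each, giving a DSC of 0.75, which would exceed the 0.7 level reported above.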