Improved outcome models with denoising diffusion

D. Dudas, T.J. Dilling, I. El Naqa

doi:10.1016/j.ejmp.2024.103307

Abstract

Purpose: Radiotherapy outcome modelling often suffers from class imbalance in the modelled endpoints. One of the main options to address this issue is to introduce new synthetically generated datapoints using generative models such as Denoising Diffusion Probabilistic Models (DDPM). In this study, we implemented a DDPM to improve the performance of a tumor local control model trained on an imbalanced dataset, and compared this approach with other common techniques.

Methods: A dataset of 535 NSCLC patients treated with SBRT (50 Gy/5 fractions) was used to train a deep learning outcome model for tumor local control prediction. The dataset included complete treatment planning data (planning CT images, 3D planning dose distribution and patient demographics) with sparsely distributed endpoints (6–7% of patients experiencing local failure). Consequently, we trained a novel conditional 3D DDPM model to generate synthetic treatment planning data. The synthetically generated treatment planning datapoints were used to supplement the real training dataset, and the resulting improvement in the model’s performance was studied. The results were also compared with those of other common techniques for class-imbalanced training, such as Oversampling, Undersampling, Augmentation, Class Weights, SMOTE and ADASYN.

Results: Synthetic DDPM-generated data were visually trustworthy, with a Fréchet inception distance (FID) below 50. Extending the training dataset with the synthetic data improved the model’s performance by more than 10%, while the other techniques yielded only about 4% improvement.

Conclusions: DDPM introduces a novel approach to class-imbalanced outcome modelling problems. The model generates realistic synthetic radiotherapy planning data, with strong potential to increase the performance and robustness of outcome models.
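As a rough illustration of two of the baseline techniques named above (Class Weights and Oversampling), the following is a minimal sketch rather than the authors' implementation. The toy dataset (535 cases, 35 failures, four random features), the inverse-frequency weighting formula and the function names are all assumptions made for the example; the study itself works with 3D planning CT and dose data.

```python
import numpy as np


def inverse_frequency_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    classes, counts = np.unique(labels, return_counts=True)
    weights = labels.size / (classes.size * counts)
    return dict(zip(classes.tolist(), weights.tolist()))


def random_oversample(features, labels, seed=0):
    """Duplicate minority-class rows (with replacement) until classes are balanced."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()
    selected = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(labels == cls)
        # Draw only as many extra rows as this class is missing.
        extra = rng.choice(idx, size=target - count, replace=True)
        selected.append(np.concatenate([idx, extra]))
    order = np.concatenate(selected)
    return features[order], labels[order]


# Hypothetical stand-in data: 535 cases, 35 local failures (about 6.5%),
# and 4 made-up numeric features per case.
rng = np.random.default_rng(42)
labels = np.zeros(535, dtype=int)
labels[:35] = 1
features = rng.normal(size=(535, 4))

weights = inverse_frequency_class_weights(labels)
balanced_x, balanced_y = random_oversample(features, labels)

print(weights)
print(balanced_x.shape, int(balanced_y.sum()))
```

Random oversampling only repeats existing minority cases, whereas the approach described in the abstract supplements the training set with newly generated synthetic cases; in this sketch, that would correspond to replacing the duplicated rows with samples drawn from a trained generative model.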

Full Text