In the field of healthcare, the acquisition of sample is usually restricted by multiple considerations, including cost, labor- intensive annotation, privacy concerns, and radiation hazards, therefore, synthesizing images-of-interest is an important tool to data augmentation. Diffusion models have recently attained state-of-the-art results in various synthesis tasks, and embedding energy functions has been proved that can effectively guide the pre-trained model to synthesize target samples. However, we notice that current method development and validation are still limited to improving indicators, such as Fréchet Inception Distance score (FID) and Inception Score (IS), and have not provided deeper investigations on downstream tasks, like disease grading and diagnosis. Moreover, existing classifier guidance which can be regarded as a special case of energy function can only has a singular effect on altering the distribution of the synthetic dataset. This may contribute to in-distribution synthetic sample that has limited help to downstream model optimization. All these limitations remind that we still have a long way to go to achieve controllable generation. In this work, we first conducted an analysis on previous guidance as well as its contributions on further applications from the perspective of data distribution. To synthesize samples which can help downstream applications, we then introduce uncertainty guidance in each sampling step and design an uncertainty-guided diffusion models. Extensive experiments on four medical datasets, with ten classic networks trained on the augmented sample sets provided a comprehensive evaluation on the practical contributions of our methodology. Furthermore, we provide a theoretical guarantee for general gradient guidance in diffusion models, which would benefit future research on investigating other forms of measurement guidance for specific generative tasks.
Read full abstract