Abstract

Diffusion models can generate high-quality images and have attracted increasing attention. However, diffusion models adopt a progressive optimization process and often have long training and inference times, which limits their application in realistic scenarios. Recently, some latent-space diffusion models have partially accelerated training by operating on parameters in the feature space, but their additional network structures still incur a large amount of unnecessary computation. We therefore propose the Contour Wavelet Diffusion method to accelerate both training and inference. First, we introduce the contour wavelet transform to extract anisotropic low-frequency and high-frequency components from the input image, and achieve acceleration by processing these down-sampled components. Meanwhile, owing to the good reconstructive properties of wavelet transforms, the quality of the generated images is maintained. Second, we propose a batch-normalized stochastic attention module that enables the model to focus effectively on important high-frequency information, further improving the quality of image generation. Finally, we propose a balanced loss function to further improve the convergence speed of the model. Experimental results on several public datasets show that our method significantly accelerates the training and inference of the diffusion model while preserving the quality of the generated images.
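The acceleration rests on a standard wavelet property: an image can be split into half-resolution low- and high-frequency components and later reconstructed exactly, so the diffusion model can operate on smaller tensors without losing information. The abstract does not give the transform's details; the sketch below uses a plain 1-D Haar wavelet (an assumption for illustration, not the paper's anisotropic contour wavelet transform) to show the down-sampling and lossless reconstruction idea.

```python
def haar_decompose(x):
    """Split a signal into half-length low-frequency (averages)
    and high-frequency (differences) components."""
    low = [(a + b) / 2 for a, b in zip(x[0::2], x[1::2])]
    high = [(a - b) / 2 for a, b in zip(x[0::2], x[1::2])]
    return low, high


def haar_reconstruct(low, high):
    """Invert the decomposition exactly: each (low, high) pair
    recovers the original adjacent sample pair."""
    out = []
    for l, h in zip(low, high):
        out.extend([l + h, l - h])
    return out


signal = [4.0, 2.0, 6.0, 8.0, 1.0, 3.0, 5.0, 7.0]
low, high = haar_decompose(signal)          # two half-length bands
assert haar_reconstruct(low, high) == signal  # perfect reconstruction
```

Because the reconstruction is exact, processing the half-resolution bands trades no image fidelity for the reduced computation; the paper's contour wavelet extends this to anisotropic 2-D components.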
