Abstract

Generative diffusion models have recently emerged as a leading approach for generating high-dimensional data. In this paper, we show that the dynamics of these models exhibit a spontaneous symmetry breaking phenomenon that divides the generative dynamics into two distinct phases: (1) linear steady-state dynamics around a central fixed point, and (2) attractor dynamics directed towards the data manifold. These two ‘phases’ are separated by a change in the stability of the central fixed point, and the resulting window of instability is responsible for the diversity of the generated samples. Using both theoretical and empirical evidence, we show that an accurate simulation of the early dynamics does not contribute significantly to the final generation, since early fluctuations are reverted to the central fixed point. To leverage this insight, we propose a Gaussian late initialization scheme, which significantly improves model performance, achieving up to a 3× improvement in Fréchet inception distance (FID) on fast samplers while also increasing sample diversity (e.g., the racial composition of generated CelebA images). Our work offers a new way to understand the generative dynamics of diffusion models, with the potential to yield higher-performing and less biased fast samplers.
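The abstract does not spell out how the Gaussian late initialization is computed, so the following is only a minimal sketch of one plausible reading. Instead of starting the sampler from pure noise at the final time, it draws the starting state from a Gaussian moment-matched to the noised data at a later start time. The variance-preserving forward process, the function name `gaussian_late_init`, and the parameter `alpha_bar` are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def gaussian_late_init(data, alpha_bar, n_samples, rng=None):
    """Draw starting states for a sampler that begins at a late start time.

    Hypothetical helper, not the authors' implementation.

    Assumes a variance-preserving forward process
        x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps,  eps ~ N(0, I),
    so that the noised data have
        mean  sqrt(alpha_bar) * mu
        cov   alpha_bar * Sigma + (1 - alpha_bar) * I
    where mu and Sigma are the empirical mean and covariance of the data.

    data:      (N, D) array of flattened training samples.
    alpha_bar: signal level in (0, 1] at the chosen start time.
    n_samples: number of starting states to draw.
    """
    rng = np.random.default_rng() if rng is None else rng
    data = np.asarray(data, dtype=float)
    dim = data.shape[1]

    # Empirical first and second moments of the clean data.
    mu = data.mean(axis=0)
    sigma = np.atleast_2d(np.cov(data, rowvar=False))

    # Moments of the noised data at the start time (exact under the assumed
    # forward process; the Gaussian shape itself is the approximation).
    mean_t = np.sqrt(alpha_bar) * mu
    cov_t = alpha_bar * sigma + (1.0 - alpha_bar) * np.eye(dim)

    return rng.multivariate_normal(mean_t, cov_t, size=n_samples)
```

The returned array would replace the usual standard-normal draw as the initial state, with the reverse process then run only from the chosen start time down to zero. In high dimensions a full covariance matrix is expensive, so a diagonal or low-rank approximation may be preferable; that choice is also an assumption here.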