Context. With increasing amounts of data produced by astronomical surveys, automated analysis methods have become crucial. Synthetic data are required for developing and testing such methods. Current classical approaches to simulations often suffer from insufficient detail or inaccurate representation of source type occurrences. Deep generative modeling has emerged as a novel way of synthesizing realistic image data to overcome those deficiencies. Aims. We implemented a deep generative model trained on observations to generate realistic radio galaxy images with full control over the flux and source morphology. Methods. We used a diffusion model, trained with continuous time steps to reduce sampling time without quality impairments. The two models were trained on two different datasets, respectively. One set was a selection of images obtained from the second data release of the LOFAR Two-Metre Sky Survey (LoTSS). The model was conditioned on peak flux values to preserve signal intensity information after re-scaling image pixel values. The other, smaller set was obtained from the Very Large Array (VLA) survey of Faint Images of the Radio Sky at Twenty-Centimeters (FIRST). In that set, every image was provided with a morphological class label the corresponding model was conditioned on. Conditioned sampling is realized with classifier-free diffusion guidance. We evaluated the quality of generated images by comparing the distributions of different quantities over the real and generated data, including results from the standard source-finding algorithms. The class conditioning was evaluated by training a classifier and comparing its performance on both real and generated data. Results. We have been able to generate realistic images of high quality using 25 sampling steps, which is unprecedented in the field of radio astronomy. The generated images are visually indistinguishable from the training data and the distributions of different image metrics were successfully replicated. The classifier is shown to perform equally well for real and generated images, indicating strong sampling control over morphological source properties.
Read full abstract