Generative artificial intelligence (GenAI) has advanced rapidly, with notable achievements such as ChatGPT and Bard. The deep generative model (DGM) is a branch of GenAI that excels at generating raster data such as images and sound, owing to the strength of deep neural networks (DNNs) in inference and recognition. The built-in inference mechanism of the DNN, which simulates the synaptic plasticity of the human neural network, fosters the generative ability of the DGM, which produces surprising results with the support of statistical flexibility. Two popular approaches in DGM are the variational autoencoder (VAE) and the generative adversarial network (GAN). Both VAE and GAN have their own strengths, although they share the same underlying statistical theory as well as the significant complexity of DNN hidden layers, which act as effective encoding/decoding functions without concrete specifications. This research unifies VAE and GAN into a consistent and consolidated model called the adversarial variational autoencoder (AVA), in which the VAE and GAN complement each other: the VAE is a good data generator that encodes data by means of the Kullback–Leibler divergence, and the GAN is an important method for assessing whether data is real or fake. In other words, the AVA aims to improve the accuracy of generative models and to extend the function of simple generative models. In methodology, this research combines applied mathematical concepts with computer programming techniques in order to implement and solve complicated problems as simply as possible.