Abstract

The latent space of pre-trained generative adversarial networks (GANs) is rich in semantic information, but that information is often highly entangled. Identifying semantic directions within this latent space is crucial, as these directions correlate with image attributes and are central to image editing tasks. Existing methods for semantic discovery usually involve labor-intensive procedures such as manual labeling and training attribute classifiers, which limits their practicality. To address this issue, this paper proposes the Optimal Transport-based Unsupervised Semantic Disentanglement (OTUSD) algorithm, which efficiently uncovers semantic directions in the latent space of GANs by combining manifold learning with optimal transport (OT) theory. OTUSD applies singular value decomposition (SVD) to the OT matrix that links latent codes to generated images; the resulting singular vectors correspond to semantically meaningful directions. Unlike traditional methods, OTUSD bypasses time-consuming labeling and training, enhancing efficiency and revealing a wider array of semantically meaningful directions. Experimental results demonstrate the effectiveness of OTUSD in discovering semantic directions in several state-of-the-art GAN models, including StyleGAN, StyleGAN2, and BigGAN. These results underscore the applicability of OTUSD to image editing and related tasks, and highlight the value of harnessing the manifold learning and OT mapping capabilities inherent in GANs for semantic disentanglement. The implementation code is available at https://github.com/LuckAlex/OTUSD.
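The core recipe described above (an OT coupling between latent codes and generated images, followed by SVD of that coupling) can be sketched on toy data. This is a minimal illustration, not the authors' implementation: the toy generator, the Sinkhorn regularization strength, and the choice to project image features into the latent dimension before computing the ground cost are all assumptions made here for the sake of a runnable example.

```python
# Hedged sketch of the abstract's pipeline on synthetic data.
# Assumptions (not from the paper): toy generator, entropic Sinkhorn OT,
# and PCA projection of "images" so a ground cost between spaces exists.
import numpy as np

def sinkhorn_plan(C, eps=0.1, n_iter=300):
    """Entropic-regularized OT plan for cost matrix C with uniform marginals."""
    n, m = C.shape
    K = np.exp(-C / (eps * C.max()))        # Gibbs kernel, cost normalized
    a, b = np.ones(n) / n, np.ones(m) / m   # uniform source/target weights
    v = np.ones(m)
    for _ in range(n_iter):                 # alternating scaling updates
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]      # transport plan P

rng = np.random.default_rng(0)
n, d, D = 64, 8, 32
Z = rng.normal(size=(n, d))                 # latent codes
W = rng.normal(size=(d, D))
X = np.tanh(Z @ W)                          # stand-in for generated images

# Project images into the latent dimension so distances are comparable
# (an assumption; the paper's actual cost construction may differ).
Xp = X @ np.linalg.svd(X, full_matrices=False)[2][:d].T

C = ((Z[:, None, :] - Xp[None, :, :]) ** 2).sum(-1)
P = sinkhorn_plan(C)                        # OT matrix linking codes to images

# SVD of the OT matrix; pull leading singular vectors back into latent space
# to obtain candidate semantic directions.
U, S, Vt = np.linalg.svd(P)
directions = Z.T @ U[:, :3]
directions /= np.linalg.norm(directions, axis=0, keepdims=True)
```

In a real setting, `Z` would be sampled latents of a pre-trained GAN and `X` the corresponding generated images (or deep features thereof); each column of `directions` would then be a unit-norm latent direction to test for a consistent attribute edit.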
