Abstract

With the widespread adoption of high-performance GPUs, NPUs, and other accelerators in recent years, the barrier to deploying large models at the edge has steadily lowered. Meanwhile, as social networks continue to thrive and virtual reality technology advances rapidly, individuals increasingly need realistic virtual identities to meet evolving social demands. In this context, portrait style transfer has attracted growing research interest. Unlike previous works that mainly follow generative adversarial network or variational autoencoder pipelines, this research proposes a method for person portrait style transfer based on the Stable Diffusion model. With a pre-trained base model, users need only upload a small number of photos to train a personalized LoRA model. The approach fuses styles through textual prompts and auxiliary models to modify portrait styles and generate images of individuals in different poses. Extensive visualization results validate the effectiveness of this work, which generates realistic portrait images of people in a variety of styles.
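To make the described pipeline concrete, the following is a minimal sketch of the inference side, assuming the Hugging Face diffusers library; the model checkpoint, LoRA weights path, identity token, and prompt are illustrative placeholders, not artifacts from the paper.

    # A minimal sketch, assuming the `diffusers` library; paths and
    # prompt tokens below are hypothetical illustrations only.
    import torch
    from diffusers import StableDiffusionPipeline

    # Load a pre-trained Stable Diffusion base model.
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # Attach a per-user LoRA adapter, fine-tuned beforehand from a small
    # set of the user's photos (hypothetical path; training the adapter
    # is a separate offline step).
    pipe.load_lora_weights("./lora/user_portrait_lora.safetensors")

    # Style fusion via the textual prompt: the identity token ("sks person",
    # a common placeholder convention) comes from the LoRA fine-tuning,
    # while the rest of the prompt steers the portrait style.
    prompt = "portrait of sks person, watercolor painting style, soft lighting"
    image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
    image.save("styled_portrait.png")

Swapping the LoRA weights or the style phrases in the prompt would, under these assumptions, yield the different per-person identities and portrait styles the abstract describes.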
