Abstract

Surprising performance has been achieved in style transfer since deep learning was introduced to it. However, the existing state-of-the-art (SOTA) algorithms either suffer from quality issues or high computational complexity. The quality issues include shape retention and the adequacy of style migration, and the computational complexity is reflected in the network complexity and additional updates when the style changes. To deal with the above problems, we propose a novel low computational complexity arbitrary style transfer algorithm (LCCStyle) that mainly consists of a transformation feature module (TFM) and learning transformation module (LTM). The TFM is responsible for transforming the content feature map into the stylized feature map without impact on the integrity of content information, which contributes to good shape retention and full style migration. In addition, to avoid additional updates when the style changes, we propose a new training mechanism for arbitrary style transfer to directly generate the parameters of the TFM by a hyper-network. However, the widely used hyper-networks are composed of fully connected layers, which cause a large number of parameters. Hence, we designed a hyper-network (LTM) consisting of one-dimensional convolution to adapt to the characteristics of the Gram matrix of the style feature map, contributing to a small model size and having no impact on quality. Quantitative comparison and user study show that LCCStyle achieves high performance both on the adequacy of style migration and shape retention. Furthermore, compared with the SOTAs, the size of the proposed model is reduced by a large margin of nearly 51.4% <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$\sim$</tex-math></inline-formula> 99.6%. When the input is 512×512 pixels, the processing speeds in the cases of unchanged style and constantly changing style are increased by at least 135% and 227%, respectively. On an Nvidia TITAN RTX GPU, LCCStyle reaches 60fps for 720p video and takes only 1 s to process 8 K images. <uri xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">https://github.com/HuangYujie94/LCCStyle</uri> .

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call