Arbitrary style transfer (AST) has garnered considerable attention for its ability to transfer styles infinitely. Although existing methods have achieved impressive results, they may overlook style consistencies and fail to capture crucial style patterns, leading to inconsistent style transfer (ST) caused by minor disturbances. To tackle this issue, we conduct a mathematical analysis of inconsistent ST and develop a style inconsistency measure (SIM) to quantify the inconsistencies between generated images. Moreover, we propose a consistent AST (CAST) framework that effectively captures and transfers essential style features into content images. The proposed CAST framework incorporates an intersection-of-union-preserving crop (IoUPC) module to obtain style pairs with minor disturbance, a self-attention (SA) module to learn the crucial style features, and a style inconsistency loss regularization (SILR) to facilitate consistent feature learning for consistent stylization. Our proposed framework not only provides an optimal solution for consistent ST but also outperforms existing methods when embedded into the CAST framework. Extensive experiments demonstrate that the proposed CAST framework can effectively transfer style patterns while preserving consistency and achieve the state-of-the-art performance.
Read full abstract