Abstract

Fashion image retrieval is an important service on e-commerce platforms and the basis of various fashion-related AI applications. Studies have shown that in a multi-modal setting (images plus attribute labels), embedding items into attribute-specific spaces supports more fine-grained similarity measures, which is especially suitable for fashion retrieval tasks. In this paper, we propose an attention-based attribute-guided similarity learning network (AttnFashion) for fashion image retrieval. The core of the network is an attribute-guided spatial attention module and an attribute-guided channel attention module, which capture the mapping between attributes and image regions and the mapping between attributes and high-level image semantics, respectively. To enable deep interaction between the two modules, we design a parallel structure in which they share attribute embeddings and guide each other to extract attribute-specific features, which also reduces the number of parameters in the attention modules. An adaptive feature fusion strategy is proposed to synthesise the features extracted by the two modules. Extensive experiments show that the proposed AttnFashion outperforms current competitive networks on fine-grained attribute-based fashion retrieval.
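The abstract does not give the modules' exact formulation, so the following PyTorch-style sketch only illustrates the general idea: a shared attribute embedding drives a spatial branch (attribute-to-region weights) and a channel branch (attribute-to-semantics gates), whose outputs are combined by a learned fusion gate. All names (AttributeGuidedAttention, spatial_proj, fusion_gate), layer sizes, and the gated fusion rule are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class AttributeGuidedAttention(nn.Module):
    # Hypothetical sketch of the two attribute-guided attention branches
    # described in the abstract; the real AttnFashion details differ.
    def __init__(self, num_attrs: int, channels: int, embed_dim: int = 128):
        super().__init__()
        # Shared attribute embedding used by both branches (parallel structure).
        self.attr_embed = nn.Embedding(num_attrs, embed_dim)
        # Spatial branch: attribute embedding -> per-location relevance query.
        self.spatial_proj = nn.Linear(embed_dim, channels)
        # Channel branch: attribute embedding -> per-channel gate.
        self.channel_proj = nn.Linear(embed_dim, channels)
        # Adaptive fusion: a learned scalar gate balancing the two branches.
        self.fusion_gate = nn.Linear(2 * channels, 1)

    def forward(self, feat: torch.Tensor, attr: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H, W) backbone feature map; attr: (B,) attribute ids.
        b, c, h, w = feat.shape
        e = self.attr_embed(attr)                          # (B, D)

        # Attribute-guided spatial attention: weight image regions by their
        # relevance to the queried attribute.
        q = self.spatial_proj(e)                           # (B, C)
        logits = torch.einsum('bc,bchw->bhw', q, feat) / c ** 0.5
        s = torch.softmax(logits.view(b, -1), dim=1).view(b, 1, h, w)
        f_spatial = (feat * s).sum(dim=(2, 3))             # (B, C)

        # Attribute-guided channel attention: gate high-level semantic
        # channels according to the same attribute embedding.
        gate = torch.sigmoid(self.channel_proj(e))         # (B, C)
        f_channel = feat.mean(dim=(2, 3)) * gate           # (B, C)

        # Adaptive feature fusion of the two branch outputs.
        a = torch.sigmoid(self.fusion_gate(torch.cat([f_spatial, f_channel], dim=1)))
        return a * f_spatial + (1 - a) * f_channel         # attribute-specific embedding

For example, with a CNN backbone producing feat of shape (B, 512, 7, 7) and attr holding the index of the queried attribute type, the module returns one 512-d attribute-specific embedding per image, which can then be compared under a per-attribute similarity measure.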
