Abstract

Clothes image search is an important learning task in fashion analysis: given a user-provided query, the goal is to find the most relevant clothes in a database. To address this problem, most existing methods employ a two-step approach, i.e., they first detect the target clothes and then crop them to feed a model for similarity learning. However, this two-step approach is time-consuming and resource-intensive. One-step methods, by contrast, provide efficient solutions that integrate clothes detection and search in a unified framework. However, since one-step methods usually rely on anchor-based detectors, they inevitably inherit their limitations, such as the high computational complexity caused by dense anchors and high sensitivity to hyperparameters. To address these issues, we propose an anchor-free framework for joint clothes detection and search. Specifically, we first choose an anchor-free detector as the backbone. We then add a mask prediction branch and a Re-ID embedding branch to the framework. The mask prediction branch predicts the masks of clothes, while the Re-ID embedding branch extracts rich embedding features of clothes; these features are aggregated via a mask pooling module that references the estimated target clothes masks. In this way, the extracted features capture more information from the region covered by the clothes mask. Finally, we introduce a match loss to fine-tune the embedding features in the Re-ID branch, further improving retrieval performance. Experimental results on real datasets demonstrate the effectiveness of the proposed approach.
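To make the mask pooling idea concrete, below is a minimal sketch of how per-pixel features might be aggregated under a predicted clothes mask, assuming a PyTorch setting. The function name `mask_pool`, the tensor shapes, and the soft-mask weighting are illustrative assumptions, not the paper's exact implementation.

```python
import torch

def mask_pool(feat: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Aggregate per-pixel features inside a predicted clothes mask.

    feat: (C, H, W) feature map from the backbone (hypothetical shape).
    mask: (H, W) soft mask in [0, 1] from the mask prediction branch.
    Returns a (C,) embedding in which pixels inside the estimated
    clothes region dominate the pooled feature.
    """
    # Normalize the mask so its weights sum to 1 (epsilon avoids divide-by-zero).
    weights = mask / (mask.sum() + 1e-6)
    # Weighted sum over spatial dimensions yields one feature vector per instance.
    return (feat * weights.unsqueeze(0)).sum(dim=(1, 2))

# Usage: pool a random feature map with a random soft mask.
feat = torch.randn(256, 32, 32)
mask = torch.rand(32, 32)
embedding = mask_pool(feat, mask)  # shape: (256,)
```

One plausible motivation for this weighting, consistent with the abstract, is that background pixels outside the mask contribute little to the pooled embedding, so the Re-ID feature concentrates on the clothes region rather than on a whole rectangular crop.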
