Abstract
Fashion recommendation poses two main challenges: visual understanding and visual matching. Visual understanding aims to extract effective visual features. Visual matching aims to model a human notion of compatibility in order to compute a match between fashion items. Most previous studies rely on the recommendation loss alone to guide both. Although the features captured by these methods describe basic characteristics (e.g., color, texture, shape) of the input items, they are not directly related to the visual signals of the output items (those to be recommended). This is problematic because the aesthetic characteristics (e.g., style, design) from which the output items can be directly inferred are lacking: the features are learned under the recommendation loss alone, where the only supervision signal is whether two given items match. To address this problem, we propose a neural co-supervision learning framework called the FAshion Recommendation Machine (FARM). FARM improves visual understanding by adding the supervision of a generation loss, which we hypothesize encodes aesthetic information better. FARM enhances visual matching with a novel layer-to-layer matching mechanism that fuses aesthetic information more effectively while avoiding an overemphasis on generation quality at the expense of recommendation performance. Extensive experiments on two publicly available datasets show that FARM outperforms state-of-the-art models on outfit recommendation in terms of AUC and MRR. Detailed analyses of generated and recommended items demonstrate that FARM encodes better features and generates high-quality images that serve as references to improve recommendation performance.
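The two reported metrics, AUC and MRR, can be illustrated with a minimal sketch. It assumes that each test case consists of one positive (compatible) item scored against a set of sampled negative items. The function names, the tie-handling rule, and the toy scores are hypothetical and are not taken from the paper's evaluation code.

```python
def auc(pos_score, neg_scores):
    """Fraction of negatives scored below the positive item (ties count half)."""
    wins = sum(
        1.0 if pos_score > s else 0.5 if pos_score == s else 0.0
        for s in neg_scores
    )
    return wins / len(neg_scores)


def reciprocal_rank(pos_score, neg_scores):
    """1 / rank of the positive item when all items are sorted by descending score."""
    rank = 1 + sum(1 for s in neg_scores if s > pos_score)
    return 1.0 / rank


# Toy data (assumed): each case is (score of the positive item, scores of negatives).
cases = [
    (0.9, [0.1, 0.4, 0.95]),  # one negative outranks the positive
    (0.7, [0.2, 0.3, 0.5]),   # the positive is ranked first
]

# Average both metrics over all test cases.
mean_auc = sum(auc(p, n) for p, n in cases) / len(cases)
mrr = sum(reciprocal_rank(p, n) for p, n in cases) / len(cases)
print(f"AUC={mean_auc:.4f}  MRR={mrr:.4f}")
```

Both metrics depend only on how the positive item ranks relative to the negatives, so higher values indicate that compatible items are scored above incompatible ones more consistently.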
Highlights
Fashion recommendation has attracted increasing attention [14, 18, 20] for its potentially wide applications in fashion-oriented online communities such as Polyvore and Chictopia. By recommending fashionable items that people may be interested in, fashion recommendation can promote the development of online retail by stimulating people’s interest and participation in online shopping.
Iwata et al. [15] define three types of features, i.e., color, texture and local descriptors such as Scale Invariant Feature Transform (SIFT), and propose a recommendation model based on Graphical Models (GM).
We address the challenges of outfit recommendation from a novel perspective by proposing a neural co-supervision learning framework, called the FAshion Recommendation Machine (FARM).
Summary
Fashion recommendation has attracted increasing attention [14, 18, 20] for its potentially wide applications in fashion-oriented online communities such as Polyvore and Chictopia. By recommending fashionable items that people may be interested in, fashion recommendation can promote the development of online retail by stimulating people’s interest and participation in online shopping. Visual matching requires modeling a human notion of the compatibility between fashion items [41], which involves matching features such as color and shape. Iwata et al. [15] define three types of features, i.e., color, texture and local descriptors such as Scale Invariant Feature Transform (SIFT) (for visual understanding), and propose a recommendation model based on Graphical Models (GM) (for visual matching). Liu et al. [29] define five types of features, namely Histograms of Oriented Gradient (HOG) [9], Local Binary Pattern (LBP) [1], color moment, color histogram and skin descriptor [5] (for visual understanding), and propose a recommendation model based on a latent Support Vector Machine (SVM) (for visual matching).