Relative attribute (RA) learning aims to learn a ranking function that describes the relative strength of an attribute. Most current approaches learn a linear ranking function for each attribute using hand-crafted visual features. In contrast to existing work, in this paper we propose a novel deep relative attributes (DRA) algorithm that learns visual features and an effective nonlinear ranking function describing the RA of image pairs in a unified framework. The visual features and the ranking function are learned jointly, so that each can benefit the other. The proposed DRA model comprises five convolutional neural layers, five fully connected layers, and a relative loss function that contains a contrastive constraint for ordered image pairs and a similar constraint for unordered image pairs. To train the DRA model effectively, we transfer knowledge from large-scale visual recognition on ImageNet [1] to the RA learning task. We evaluate the proposed DRA model on three widely used datasets. Extensive experimental results demonstrate that the proposed DRA model consistently and significantly outperforms state-of-the-art RA learning methods. On the public OSR, PubFig, and Shoes datasets, compared with previous RA learning results [2], the average ranking accuracy improves by about $8\%$, $9\%$, and $14\%$, respectively.
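The two-part relative loss can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the squared-hinge form for ordered pairs, the squared-difference form for unordered pairs, the `margin` parameter, and the function names below are all assumptions made for illustration, not the paper's definition. The scores stand in for the outputs of the ranking network on the two images of a pair.

```python
def relative_loss(score_i, score_j, ordered, margin=1.0):
    """Loss for one image pair, given the ranking scores of both images.

    ordered=True  : image i is labeled as showing the attribute more strongly
                    than image j (contrastive constraint, assumed squared hinge).
    ordered=False : the two images are labeled as similar in attribute strength
                    (similar constraint, assumed squared difference).
    `margin` is a hypothetical hyperparameter, not taken from the abstract.
    """
    diff = score_i - score_j
    if ordered:
        # Penalize the pair unless score_i exceeds score_j by at least `margin`.
        return 0.5 * max(0.0, margin - diff) ** 2
    # Penalize any gap between the two scores.
    return 0.5 * diff ** 2


def mean_relative_loss(pairs, margin=1.0):
    """Average the per-pair loss over an iterable of (score_i, score_j, ordered)."""
    pairs = list(pairs)
    total = sum(relative_loss(si, sj, o, margin) for si, sj, o in pairs)
    return total / len(pairs)


if __name__ == "__main__":
    batch = [
        (2.0, 0.0, True),   # correctly ordered with enough margin
        (0.0, 0.0, True),   # ordered pair the model fails to separate
        (0.5, 0.5, False),  # similar pair with identical scores
    ]
    print(mean_relative_loss(batch))
```

Under these assumptions, an ordered pair contributes no loss once its scores are separated by the margin, while an unordered pair contributes no loss only when its scores coincide. This is one way the two constraints could jointly shape the learned ranking function.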