Abstract

A large number of distance functions exist that measure the similarity between feature vectors and can therefore be used for ranking purposes. When multiple representations of the same object are available, distances in each representation space may be combined to produce a single similarity score. In this paper, we present a method to build such a similarity ranking out of a family of distance functions. Unlike other approaches that aim to select the best distance function for a particular context, we use several distances and combine them in a convenient way. To this end, we adopt a classical similarity learning approach and cast the problem as a standard supervised machine learning task. As in most similarity learning settings, the training data consist of a set of pairs of objects that have been labeled as similar/dissimilar. These are first used as input to a transformation function that computes a new feature vector for each pair by applying a family of distance functions in each of the available representation spaces. This information is then used to learn a classifier. The approach has been tested on three different repositories. Results show that the proposed method outperforms alternative approaches in high-dimensional spaces and highlight the benefits of using multiple distances in each representation space.
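The transformation step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the specific distance functions (Euclidean, Manhattan, cosine) and the two representation spaces are assumptions chosen purely for the example; the resulting feature vector is what a standard binary classifier would then be trained on.

```python
import math

# Hypothetical family of distance functions (assumption: the actual
# family used in the paper is not specified in the abstract).
def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))

def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (nu * nv) if nu and nv else 1.0

DISTANCES = [euclidean, manhattan, cosine_distance]

def pair_features(obj_x, obj_y):
    """Map a pair of objects, each given in several representation
    spaces, to one feature vector: every distance function in the
    family is applied in every representation space."""
    return [d(rx, ry) for rx, ry in zip(obj_x, obj_y) for d in DISTANCES]

# Example: two objects, each with two representations (a 2-D and a 3-D space).
x = [(0.0, 0.0), (1.0, 0.0, 0.0)]
y = [(3.0, 4.0), (0.0, 1.0, 0.0)]
features = pair_features(x, y)
# 2 spaces x 3 distances -> a 6-dimensional feature vector; together with
# the pair's similar/dissimilar label it forms one training instance.
```

Any off-the-shelf binary classifier can then be fit on these distance-based feature vectors, which is what reduces similarity learning to a standard supervised task.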
