A Family of Rank Similarity Measures Based on Maximized Effectiveness Difference

Luchen Tan,Charles L.A Clarke

doi:10.1109/tkde.2015.2448541

Abstract

Rank similarity measures provide a method for quantifying differences between search engine results without the need for relevance judgments. For example, the providers of a search service might use such measures to estimate the impact of a proposed algorithmic change across a large number of queries—perhaps millions—identifying those queries where the impact is greatest. In this paper, we propose and validate a family of rank similarity measures, each derived from an associated effectiveness measure. Each member of the family is based on the maximization of effectiveness difference under this associated measure. Computing this maximized effectiveness difference (MED) requires the solution of an optimization problem that varies in difficulty, depending on the associated measure. We present solutions for several standard effectiveness measures, including nDCG, AP, and ERR. Through an experimental validation, we show that MED reveals meaningful differences between retrieval runs. Mathematically, MED is a metric, regardless of the associated measure. Prior work has established a number of other desiderata for rank similarity in the context of search, and we demonstrate that MED satisfies these requirements. Unlike previous proposals, MED allows us to directly translate assumptions about user behavior from any established effectiveness measure to create a corresponding rank similarity measure. In addition, MED cleanly accommodates partial relevance judgments, and if complete relevance information is available, it reduces to a simple difference between effectiveness values.

Full Text