Abstract

Query result diversification is a bi-criteria optimization problem for ranking query results. Given a database D, a query Q and a positive integer k, the task is to find a set of k tuples from Q(D) such that the tuples are as relevant as possible to the query and, at the same time, as diverse as possible from each other. Subsets of Q(D) are ranked by an objective function defined in terms of relevance and diversity. Query result diversification has found a variety of applications in databases, information retrieval and operations research.

This paper studies the complexity of result diversification for relational queries. We identify three problems in connection with query result diversification: to determine whether there exists a set of k tuples that is ranked above a bound with respect to relevance and diversity, to assess the rank of a given k-element set, and to count how many k-element sets are ranked above a given bound. We study these problems for a variety of query languages and for three objective functions. We establish upper and lower bounds for these problems, all matching, for both combined complexity and data complexity. We also investigate several special settings of these problems, identifying tractable cases.
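To make the three problems concrete, the following is a minimal brute-force sketch in Python. It is illustrative only: the abstract does not define the three objective functions, so the `objective` below (a weighted sum of relevance and pairwise distance with a trade-off parameter `lam`) is an assumption, as are the helper names `exists_above`, `rank_of` and `count_above`, the toy relevance scores, and the reading of "rank" as one plus the number of strictly better k-element sets. The exhaustive enumeration is not an algorithm from the paper; it only shows what each problem asks.

```python
from itertools import combinations

def objective(subset, relevance, distance, lam=0.5):
    """Assumed objective: weighted relevance plus pairwise diversity."""
    k = len(subset)
    rel = sum(relevance[t] for t in subset)
    div = sum(distance(a, b) for a, b in combinations(subset, 2))
    # The (k - 1) factor balances k relevance terms against k(k-1)/2 pair terms.
    return (k - 1) * (1 - lam) * rel + lam * div

def exists_above(results, k, bound, relevance, distance, lam=0.5):
    """Decision problem: is some k-element subset of Q(D) scored at or above the bound?"""
    return any(objective(s, relevance, distance, lam) >= bound
               for s in combinations(results, k))

def rank_of(candidate, results, relevance, distance, lam=0.5):
    """Rank assessment (assumed reading): 1 + number of strictly better k-element sets."""
    score = objective(candidate, relevance, distance, lam)
    return 1 + sum(1 for s in combinations(results, len(candidate))
                   if objective(s, relevance, distance, lam) > score)

def count_above(results, k, bound, relevance, distance, lam=0.5):
    """Counting problem: how many k-element subsets are scored at or above the bound?"""
    return sum(1 for s in combinations(results, k)
               if objective(s, relevance, distance, lam) >= bound)

# Toy stand-in for Q(D): four numeric "tuples" with made-up relevance scores.
results = [1, 4, 6, 10]
relevance = {1: 0.9, 4: 0.5, 6: 0.6, 10: 0.2}
distance = lambda a, b: abs(a - b)

print(exists_above(results, 2, 3.0, relevance, distance))
print(rank_of((1, 10), results, relevance, distance))
print(count_above(results, 2, 3.0, relevance, distance))
```

Each function enumerates all k-element subsets of the result set, so its running time grows combinatorially with the size of Q(D). That cost is what motivates asking how hard these problems are for different query languages and objective functions.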
