Reporting L Most Favorite Objects in Uncertain Databases with Probabilistic Reverse Top-k Queries

Guoqing Xiao,Kenli Li,Keqin Li

doi:10.1109/icdmw.2015.47

Abstract

Top-k queries are widely studied for identifying a ranked set of the k most interesting objects based on the individual user preference. Reverse top-k queries are proposed from the perspective of the product manufacturer, which are essential for manufacturers to assess the potential market and impacts of their products. However, the existing approaches for reverse top-k queries are all based on the assumption that the underlying data are exact. Due to the intrinsic differences between uncertain and certain data, these methods are designed only in certain databases and cannot be applied to uncertain case directly. Motivated by this, in this paper, we firstly model the probabilistic reverse top-k queries in the context of uncertain data. Moreover, we formulate the challenging problem of processing queries that report l most favorite objects to users, where impact factor of an object is defined as the cardinality of the probabilistic reverse top-k query result set. For speeding up the query, we exploit several properties of probabilistic threshold top-k queries and probabilistic skyline queries to reduce the solution space of this problem. In addition, an upper bound of the potential users is estimated to reduce the cost of computing the probabilistic reverse top-k queries for the candidate objects. Furthermore, effective pruning heuristics are presented to further reduce the search space of query processing. Finally, efficient query algorithms are presented seamlessly with integration of the proposed pruning strategies. Extensive experiments demonstrate the efficiency and effectiveness of our proposed algorithms with various experimental settings.

Full Text