Advanced analysis and join queries in multidimensional spaces

Shen Ge

doi:10.5353/th_b4979933

Abstract

Multidimensional data are ubiquitous, and managing and analyzing them efficiently is a core database research problem. Much previous work has focused on indexing, analyzing, and querying multidimensional data. This dissertation proposes and studies three challenging advanced analysis and join problems in multidimensional spaces, and provides efficient solutions for their applications.

First, the problem of the generalized budget constrained optimization query (Gen-BOQ) is studied. In real life, constraints often make it difficult for manufacturers to create new products that dominate those of their competitors. These constraints can be modeled by constraint functions, and the problem is then to find the best regions of the multidimensional space in which the features of a new product could be placed. The profitability of each region is evaluated from the number of objects it dominates and the number of objects that dominate it, and the best regions are returned. Although Gen-BOQ computation is challenging because of its high complexity, an efficient divide-and-conquer framework is offered for this problem. In addition, an approximation method is proposed that trades result quality against query cost.

Next, the efficient evaluation of all top-k queries (ATOPk) in multidimensional spaces is investigated. An ATOPk query computes the top-ranked objects for a whole group of preference functions at once. As an application, consider an online store that needs to provide recommendations to a large number of users simultaneously. This problem has been largely overlooked by past research; in this thesis, batch algorithms are proposed instead of naively evaluating each top-k query individually. Similar preferences are grouped together, and two algorithms are proposed: one using block indexed nested loops and one using a view-based thresholding strategy. The optimized view-based threshold algorithm is demonstrated to be consistently the best. Moreover, an all top-k query helps to evaluate other queries that rely on the results of multiple top-k queries, such as the reverse top-k queries and top-m influential queries proposed in previous work. It is shown that applying the view-based approach to these queries can improve on the performance of the current state of the art by orders of magnitude.

Finally, the problem of spatio-textual similarity joins (ST-SJOIN) on multidimensional data is considered. Given objects with both spatial and textual information, ST-SJOIN retrieves the pairs of objects that are both spatially close and textually similar. One possible application of this query is friendship recommendation, which matches people who not only live near each other but also share common interests. By combining state-of-the-art strategies for spatial distance joins and set similarity joins, efficient query processing algorithms are proposed that take both the spatial and the textual constraints into account. A batch processing strategy is also introduced to boost performance; it is effective for the original textual-only joins as well. Experiments on synthetic and real datasets show that the proposed techniques outperform the baseline solutions.
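To make the ST-SJOIN definition concrete, the following is a minimal sketch of a naive nested-loop baseline, not the algorithms proposed in the dissertation. The abstract does not name the distance or similarity measures, so Euclidean distance and Jaccard similarity over term sets are assumptions here, as are the function names, the thresholds `max_dist` and `min_sim`, and the sample data.

```python
from itertools import combinations
from math import dist


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed measure); 1.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def naive_st_sjoin(objects, max_dist, min_sim):
    """Return id pairs that are spatially close AND textually similar.

    objects: list of (obj_id, (x, y), set_of_terms) tuples (hypothetical layout).
    Checks every pair, so the cost is quadratic in the number of objects.
    """
    result = []
    for (id_a, loc_a, terms_a), (id_b, loc_b, terms_b) in combinations(objects, 2):
        # Both constraints must hold: Euclidean distance and Jaccard similarity.
        if dist(loc_a, loc_b) <= max_dist and jaccard(terms_a, terms_b) >= min_sim:
            result.append((id_a, id_b))
    return result


# Hypothetical friendship-recommendation data: location plus interests.
people = [
    ("ann", (0.0, 0.0), {"hiking", "jazz", "chess"}),
    ("bob", (0.5, 0.5), {"hiking", "jazz", "film"}),
    ("cat", (9.0, 9.0), {"hiking", "jazz", "chess"}),
    ("dan", (0.2, 0.1), {"cooking"}),
]
pairs = naive_st_sjoin(people, max_dist=1.0, min_sim=0.5)
```

In this sample, "cat" shares all of "ann"'s interests but lives too far away, and "dan" lives nearby but shares no interests, so neither forms a pair with "ann". A baseline like this compares every pair of objects, which is the cost that filtering and batch strategies of the kind described above aim to avoid.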
