Scalable preference queries for high-dimensional data using map-reduce

Gheorghi Guzun, Joel E Tosado, Guadalupe Canahuate

doi:10.1109/bigdata.2015.7364013

Abstract

Preference (top-k) queries play a key role in modern data analytics tasks. Top-k techniques rely on ranking functions to determine an overall score for each object across all the relevant attributes being examined. This ranking function is provided by the user at query time, or generated for a particular user by a personalized search engine, which prevents the pre-computation of the global scores. Executing this type of query is particularly challenging for high-dimensional data. Recently, bit-sliced indices (BSI) were proposed to answer these preference queries efficiently in a non-distributed environment for data with hundreds of dimensions. As MapReduce and key-value stores proliferate as the preferred methods for analyzing big data, we set out to evaluate the performance of BSI in a distributed environment, in terms of index size, network traffic, and execution time of preference (top-k) queries, over data with thousands of dimensions. Indexing is implemented on top of Apache Spark for both column and row stores and is shown to outperform Hive running on MapReduce and on Tez for top-k (preference) queries.
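
To make the terms in the abstract concrete, the following is a minimal single-machine sketch of a bit-sliced encoding and a top-k query under a weighted-sum ranking function supplied at query time. It is an illustration only and does not reproduce the paper's distributed algorithm: the function names (build_bsi, bsi_value, top_k), the use of Python integers as bitmaps, the weighted-sum form of the ranking function, and the sample data are all assumptions made for this example. For clarity it decodes each row's value from the slices rather than scoring with bitwise operations on the slices themselves.

```python
import heapq


def build_bsi(values):
    """Encode a column of non-negative ints as bit slices.

    Slice j is a Python int used as a bitmap: bit r of slice j is set
    when bit j of values[r] is 1. (Hypothetical helper for illustration.)
    """
    n_slices = max(values).bit_length() if values else 0
    slices = [0] * n_slices
    for row, v in enumerate(values):
        for j in range(n_slices):
            if (v >> j) & 1:
                slices[j] |= 1 << row
    return slices


def bsi_value(slices, row):
    """Reconstruct the value stored for one row from its bit slices."""
    return sum(((s >> row) & 1) << j for j, s in enumerate(slices))


def top_k(index, n_rows, weights, k):
    """Return the k (score, row) pairs with the highest weighted sum.

    `weights` is the query-time ranking function, assumed here to be a
    simple weighted sum over attributes. Because it is only known at
    query time, scores cannot be precomputed.
    """
    scored = (
        (sum(w * bsi_value(index[attr], row) for attr, w in weights.items()), row)
        for row in range(n_rows)
    )
    return heapq.nlargest(k, scored)


# Made-up sample data: two attributes over four rows.
data = {"price": [3, 7, 1, 5], "rating": [4, 1, 6, 5]}
index = {attr: build_bsi(col) for attr, col in data.items()}
result = top_k(index, 4, {"price": 1, "rating": 2}, 2)
```

Changing the weights dictionary changes the ranking without rebuilding the index, which is the property the abstract points to when it says global scores cannot be precomputed.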

Similar Papers

A Two-Phase MapReduce Algorithm for Scalable Preference Queries over High-Dimensional Data
Gheorghi Guzun ... Guadalupe Canahuate
01 Jan 2015

Multidimensional Preference Query Optimization on Infrastructure Monitoring Systems
Yinghua Qin ... Gheorghi Guzun
01 Dec 2019

Slicing the Dimensionality: Top-k Query Processing for High-Dimensional Spaces
Gheorghi Guzun ... Joel Tosado
01 Jan 2014

A New Top-k Conditional XML Preference Queries
Shaikhah Alhazmi ... Mourad Ykhlef
International Journal of Artificial Intelligence & Applications | VOL. 5
30 Sep 2014
