Abstract

Interactive data exploration platforms in Web, business and scientific domains are becoming increasingly popular. Typically, users without prior knowledge of data interact with these platforms in an exploratory manner hoping they might retrieve the results they are looking for. One way to explore large-volume data is by posing aggregate queries which group values of multiple rows by an aggregate operator to form a single value: an aggregated value. Though, when a query fails, i.e., returns undesired aggregated value, users will have to undertake a frustrating trial-and-error process to refine their queries, until a desired result is attained. This data exploration process, however, is growing rather difficult as the underlying data is typically of large-volume and high-dimensionality. While heuristic-based techniques are fairly successful in generating refined queries that meet specified requirements on the aggregated values, they are rather oblivious to the (dis)similarity between the input query and its corresponding refined version. Meanwhile, enforcing a similarity-aware query refinement is rather a non-trivial challenge, as it requires a careful examination of the query space while maintaining a low processing cost. To address this challenge, we propose an innovative scheme for efficient Similarity-Aware Refinement of Aggregation Queries called (EAGER) which aims to balance the tradeoff between satisfying the aggregate and similarity constraints imposed on the refined query to maximize its overall benefit to the user. To achieve that goal, EAGER implements efficient strategies to minimize the costs incurred in exploring the available search space by utilizing similarity-based and monotonic-based pruning techniques to bound the search space and quickly find a refined query that meets users' expectations. Our extensive experiments show the scalability exhibited by EAGER under various workload settings, and the significant benefits it provides.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.