Abstract

Searching the query log of a database system has a variety of applications. In a complex database, relevant queries in the log can serve as initial examples for query formulation, or may show how to query the data in an optimized manner. Searching for queries that may cause a security or privacy breach can help detect leaks of sensitive data. In general, queries in the query log provide valuable information about how data have been accessed and used. Finding relevant queries requires searching a repository of SQL queries. However, expressing the information need, that is, specifying which queries should be retrieved, is not easy. In this paper we study the approach of search-by-example, where, given an SQL query Q, the goal is to retrieve queries that are similar to Q. We distinguish between two types of search: structural search and intent-driven search. In structural search, queries are considered similar if their textual formulations are similar, i.e., a small number of edit operations transforms one query into the other. In intent-driven search, two queries are deemed similar if they were written for the same task. We illustrate these two types of similarity and the differences between them. We present four heuristics for testing query similarity. Two of the methods are exhaustive, and two are less accurate but more efficient. We explain how to use the efficient methods to boost a search that relies on the exhaustive methods. An experimental evaluation and a user study illustrate the effectiveness of the methods.
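As a rough illustration of structural search, the sketch below ranks logged queries by a token-level edit distance to an example query, after a cheap token-overlap filter discards unlikely candidates. It is not one of the paper's four heuristics: the tokenizer, the Jaccard filter, the `min_overlap` threshold, and all function names are assumptions made for this example only.

```python
import re

def tokenize(sql):
    # Assumed tokenizer: lowercase, then split into identifiers/keywords,
    # integers, and single punctuation characters.
    return re.findall(r"[a-z_][a-z0-9_]*|\d+|[^\s\w]", sql.lower())

def edit_distance(a, b):
    # Levenshtein distance over token lists (insert, delete, substitute),
    # computed one row at a time.
    prev = list(range(len(b) + 1))
    for i, ta in enumerate(a, start=1):
        curr = [i]
        for j, tb in enumerate(b, start=1):
            cost = 0 if ta == tb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def jaccard(a, b):
    # Cheap, less accurate similarity: overlap of the two token sets.
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0

def search(query, log, k=3, min_overlap=0.3):
    # Filter with the cheap measure, then rank survivors by edit distance.
    q = tokenize(query)
    candidates = [s for s in log if jaccard(q, tokenize(s)) >= min_overlap]
    scored = sorted((edit_distance(q, tokenize(s)), s) for s in candidates)
    return scored[:k]

log = [
    "SELECT name FROM users WHERE id = 2",
    "SELECT name, email FROM users WHERE id = 1",
    "DELETE FROM orders",
]
results = search("SELECT name FROM users WHERE id = 1", log, k=2)
```

Here the `DELETE` query is dropped by the overlap filter, so the more expensive edit distance is computed only for the two `SELECT` queries. A sketch like this captures only textual similarity; two queries written for the same task but phrased very differently would need an intent-driven measure instead.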
