Candidate Tuples Research Articles

We investigate the problem of learning join queries from user examples. The user is presented with a set of candidate tuples and is asked to label them as positive or negative examples, depending on whether or not she would like the tuples as part of the join result. The goal is to quickly infer an arbitrary n-ary join predicate across an arbitrary number m of relations while keeping the number of user interactions as small as possible. We assume no prior knowledge of the integrity constraints across the involved relations. Inferring the join predicate across multiple relations when the referential constraints are unknown may occur in several applications, such as data integration, reverse engineering of database queries, and schema inference. In such scenarios, the number of tuples involved in the join is typically large. We introduce a set of strategies that let us inspect the search space and aggressively prune what we call uninformative tuples, and we directly present to the user the informative ones, that is, those that allow the user to quickly find the goal query she has in mind. In this article, we focus on the inference of joins with equality predicates and also allow disjunctive join predicates and projection in the queries. We precisely characterize the frontier between tractability and intractability for the following problems of interest in these settings: consistency checking, learnability, and deciding the informativeness of a tuple. Next, we propose several strategies for presenting tuples to the user in an order that minimizes the number of interactions. We show the efficiency of our approach through an experimental study on both benchmark and synthetic datasets.
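To make the notion of an informative tuple concrete, the following Python sketch works on a toy setting with conjunctive equality predicates only. It is an illustration under stated assumptions, not the article's algorithm: the abstract does not define informativeness formally, so the sketch assumes that a candidate tuple is informative when the predicates consistent with the labels so far disagree on it. The function names, the attribute names `A1`, `A2`, `B1`, `B2`, and the brute-force enumeration are all hypothetical.

```python
from itertools import combinations, product

def equal_pairs(t, left_attrs, right_attrs):
    """Attribute pairs (a, b) on which the candidate tuple t has equal values."""
    return frozenset((a, b) for a, b in product(left_attrs, right_attrs) if t[a] == t[b])

def consistent_predicates(all_pairs, positives, negatives):
    """Enumerate every conjunction of equalities that selects all positive
    examples and no negative example (brute force, exponential in the
    number of attribute pairs)."""
    pairs = sorted(all_pairs)
    for size in range(len(pairs) + 1):
        for combo in combinations(pairs, size):
            theta = frozenset(combo)
            selects_all_pos = all(theta <= p for p in positives)
            selects_no_neg = not any(theta <= n for n in negatives)
            if selects_all_pos and selects_no_neg:
                yield theta

def is_informative(candidate, all_pairs, positives, negatives):
    """Assumed definition: the consistent predicates disagree on the candidate."""
    labels = {theta <= candidate
              for theta in consistent_predicates(all_pairs, positives, negatives)}
    return len(labels) == 2

LEFT, RIGHT = ["A1", "A2"], ["B1", "B2"]
ALL_PAIRS = set(product(LEFT, RIGHT))

# One tuple labeled positive so far: it agrees on (A1, B1) and (A2, B2).
pos = [equal_pairs({"A1": 1, "A2": 2, "B1": 1, "B2": 2}, LEFT, RIGHT)]

# Agrees only on (A1, B1): consistent predicates disagree, so asking helps.
c1 = equal_pairs({"A1": 1, "A2": 2, "B1": 1, "B2": 3}, LEFT, RIGHT)
# Agrees on both pairs: every consistent predicate selects it, so asking is wasted.
c2 = equal_pairs({"A1": 5, "A2": 6, "B1": 5, "B2": 6}, LEFT, RIGHT)

c1_informative = is_informative(c1, ALL_PAIRS, pos, [])
c2_informative = is_informative(c2, ALL_PAIRS, pos, [])
```

In this toy run `c1` is worth showing to the user and `c2` is not. The enumeration is exponential and only serves to spell out the definition; it says nothing about the tractability results the abstract refers to.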


Advanced multimedia applications, such as geographic information systems and content-based multimedia retrieval systems, require efficient processing of k-nearest neighbor queries over large collections of multimedia objects. These queries usually include semantic information, represented by text, as well as visual information, represented by a high-dimensional feature vector. Among the available techniques for processing such queries, the incremental nearest neighbor algorithm proposed by Hjaltason and Samet is known as the best choice. However, the R-tree used in their algorithm has no facility for partially pruning the candidate tuples that will turn out not to satisfy the semantic predicate. Also, the R-tree does not perform sufficiently well on high-dimensional data, even though it provides good results on low- or medium-dimensional data. These drawbacks may lead to poor performance when processing the query. In this paper, we propose an integrated index structure, called SPY-TEC+, that provides an efficient method for indexing the visual and semantic features at the same time by combining the SPY-TEC, which was proposed for indexing high-dimensional data, with a signature file. We also propose an efficient incremental nearest neighbor algorithm for processing k-nearest neighbor queries with visual and semantic predicates on the SPY-TEC+. Finally, we show through various experiments that the SPY-TEC+ improves on the performance of the SPY-TEC for processing k-nearest neighbor queries with visual and semantic predicates.
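The drawback described above, that candidates failing the semantic predicate are still retrieved before being discarded, can be illustrated with a small Python sketch. It is not the Hjaltason and Samet algorithm and not SPY-TEC+: a heap over all objects stands in for the spatial index, and the object layout (`vec`, `tag`) and function names are assumptions made for this example.

```python
import heapq
import math

def incremental_neighbors(query, objects):
    """Yield (distance, object) pairs in order of increasing distance.
    A brute-force heap replaces the index-based traversal of a real system."""
    heap = [(math.dist(query, obj["vec"]), i) for i, obj in enumerate(objects)]
    heapq.heapify(heap)
    while heap:
        d, i = heapq.heappop(heap)
        yield d, objects[i]

def knn_with_predicate(query, objects, k, predicate):
    """Return the k nearest objects satisfying the predicate, plus the
    number of candidates that had to be visited to find them."""
    results, visited = [], 0
    for d, obj in incremental_neighbors(query, objects):
        visited += 1
        if predicate(obj):  # semantic filter applied only after retrieval
            results.append((d, obj))
            if len(results) == k:
                break
    return results, visited

objects = [
    {"vec": (1.0, 0.0), "tag": "park"},
    {"vec": (2.0, 0.0), "tag": "lake"},
    {"vec": (3.0, 0.0), "tag": "park"},
    {"vec": (4.0, 0.0), "tag": "lake"},
]
results, visited = knn_with_predicate(
    (0.0, 0.0), objects, k=2, predicate=lambda o: o["tag"] == "lake"
)
```

Here two results are returned but four candidates are visited, because the two "park" objects cannot be skipped before their distance is computed. An index that also encodes the semantic feature, as the abstract proposes, aims to avoid part of that work.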



Articles published on Candidate Tuples

Efficient computation of G-Skyline groups on massive data

Ranking the big sky: efficient top-k skyline computation on massive data

EnAli: entity alignment across multiple heterogeneous data sources

Evaluating Top-N queries in n-dimensional normed spaces

Learning Join Queries from User Examples

A Locality Sensitive Hashing Filter for Encrypted Vector Databases

TDEP: efficiently processing top-k dominating query on massive data

Processing Top-N Queries Based on p-Norm Distances

Audio Retrieval Based on Chinese Keyword Search in Relational Databases

On contextual ranking queries in databases

TJJE: An efficient algorithm for top-k join on massive data

Indexing dataspaces with partitions

Chinese Keyword Search by Indexing in Relational Databases

SPY-TEC+ : AN INTEGRATED INDEX STRUCTURE FOR k-NEAREST NEIGHBOR QUERIES WITH SEMANTIC PREDICATES IN MULTIMEDIA DATABASE

Supporting early pruning in top-k query processing on massive data
