Abstract

Keyword search is a familiar and potentially effective way to find information of interest that is locked inside relational databases. Current work has generally assumed that answers for a keyword query reside within a single database. Many practical settings, however, require that we combine tuples from multiple databases to obtain the desired answers. Such databases are often autonomous and heterogeneous in their schemas and data. This paper describes Kite, a solution to the keyword-search problem over heterogeneous relational databases. Kite combines schema matching and structure discovery techniques to find approximate foreign-key joins across heterogeneous databases. Such joins are critical for producing query results that span multiple databases and relations. Kite then exploits the joins - discovered automatically across the databases - to enable fast and effective querying over the distributed data. Our extensive experiments over real-world data sets show that (1) our query processing algorithms are efficient and (2) our approach manages to produce high-quality query results spanning multiple heterogeneous databases, with no need for human reconciliation of the different databases.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call