Abstract

Keyword-based queries are inherently ambiguous: given a set of keywords, the database search engine can only guess at the informational need the query represents. The potentially high complexity of the data makes it extremely challenging to return intelligent search results effectively. Databases let users express their informational needs precisely through structured queries. However, constructing a database query is a laborious and error-prone process that most end users cannot perform well. Keyword search alleviates this usability problem at the price of query expressiveness. Because keyword search algorithms do not differentiate between the possible informational needs represented by a keyword query, users may not receive adequate results. This paper presents Extended Incremental Query Processing (EIQP), a novel approach to bridging the gap between the usability of keyword search and the expressiveness of database queries. Extended Incremental Query Processing enables a user to start with an arbitrary keyword query and incrementally refine it into a structured query through an interactive interface. Its enabling techniques include: 1) a probabilistic framework for incremental query construction; 2) a probabilistic model to assess the possible informational needs represented by a keyword query; 3) an algorithm to obtain the optimal query construction process. This paper presents the detailed design of Extended Incremental Query Processing and demonstrates its effectiveness and scalability through experiments over real-world data and a user study.

Extracting information from semi-structured documents is a very hard task. Documents are often so large that the data set returned as the answer to a query may be too big to convey interpretable knowledge. In this paper, we describe an approach based on Tree-Based Association Rules (TARs): mined rules that provide approximate, intensional information on both the structure and the contents of XML documents. This mined knowledge is later used to provide: 1) a concise idea (the gist) of both the structure and the content of the XML document; 2) quick, approximate answers to queries.
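
To make the notion of a mined structural rule concrete, the sketch below computes support and confidence for one rule over a tiny, made-up XML collection. It is only an illustration under strong assumptions: each document is reduced to its set of root-to-node tag paths rather than real subtrees, occurrences are counted per document, and the helper names (`paths`, `rule_stats`) and the sample data are hypothetical, not taken from the paper.

```python
import xml.etree.ElementTree as ET


def paths(xml_text):
    """Return the set of root-to-node tag paths occurring in one XML document."""
    root = ET.fromstring(xml_text)
    found = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        found.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return found


def rule_stats(docs, body, head):
    """Support and confidence of the rule body => head over a collection.

    body and head are sets of paths (a simplification of subtrees);
    the head is expected to contain the body.
    """
    doc_paths = [paths(d) for d in docs]
    n_body = sum(1 for p in doc_paths if body <= p)
    n_head = sum(1 for p in doc_paths if head <= p)
    support = n_head / len(doc_paths)
    confidence = n_head / n_body if n_body else 0.0
    return support, confidence


# Hypothetical sample collection, not data from the paper.
DOCS = [
    "<paper><title/><author/><year/></paper>",
    "<paper><title/><author/></paper>",
    "<paper><title/></paper>",
    "<book><title/><author/></book>",
]

# Rule: a paper with a title tends to also have an author.
BODY = {"/paper", "/paper/title"}
HEAD = BODY | {"/paper/author"}
support, confidence = rule_stats(DOCS, BODY, HEAD)
```

A rule with high confidence summarizes a regularity of the collection, which is the kind of information a gist could expose to a user who does not yet know the document structure.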

Highlights

  • With the growth of structured information available on the Web and in online databases, it becomes increasingly difficult for users to find the exact data they seek.

  • Even a theoretically optimal ranking algorithm can, at best, rank the most common query interpretations highest, so users with less frequent informational needs may not receive adequate results. To overcome these limitations, we introduce EIQP, a new system designed to bridge the gap between the usability of keyword search and the expressiveness of database queries.

  • A probabilistic model assesses the possible informational needs represented by a keyword query.

Summary

Introduction

With the growth of structured information available on the Web and in online databases, it becomes increasingly difficult for users to find the exact data they seek. Even a theoretically optimal ranking algorithm can, at best, rank the most common query interpretations highest, so users with less frequent informational needs may not receive adequate results. To overcome these limitations, we introduce EIQP, a new system designed to bridge the gap between the usability of keyword search and the expressiveness of database queries. As for query answering, query languages for semi-structured data rely on the document structure to convey its semantics, so users need to know this structure in advance to formulate queries effectively, which is often not the case. This limitation is a crucial problem that did not emerge in the context of relational database management systems. Consequently, when accessing a large dataset for the first time, gaining some general information about its main structural and semantic characteristics helps in investigating more specific details.
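
The idea of refining a keyword query step by step can be sketched as follows. This is a minimal illustration, not the EIQP algorithm: it assumes each candidate interpretation is described by a set of keyword-to-attribute bindings ("options") and a probability, and it greedily asks about the option whose yes/no answer is least predictable (highest binary entropy). The function names, the keyword query "hanks 2001", and all probabilities are hypothetical.

```python
import math


def normalize(weights):
    """Rescale non-negative weights so that they sum to 1."""
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def best_option(interps, probs):
    """Pick the option whose yes/no answer is least predictable.

    interps maps an interpretation to the set of options it contains;
    probs maps the same interpretations to probabilities summing to 1.
    """
    options = set().union(*interps.values())

    def split_entropy(opt):
        p = sum(probs[i] for i in interps if opt in interps[i])
        if p <= 0.0 or p >= 1.0:
            return 0.0
        return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

    return max(sorted(options), key=split_entropy)


def refine(interps, probs, option, accepted):
    """Keep interpretations consistent with the user's answer and renormalize."""
    kept = {i: probs[i] for i in interps if (option in interps[i]) == accepted}
    return {i: interps[i] for i in kept}, normalize(kept)


# Hypothetical interpretations of the keyword query "hanks 2001".
INTERPS = {
    "actor=hanks AND year=2001": {"hanks:actor.name", "2001:movie.year"},
    "actor=hanks AND title=2001": {"hanks:actor.name", "2001:movie.title"},
    "director=hanks AND year=2001": {"hanks:director.name", "2001:movie.year"},
    "title=hanks AND id=2001": {"hanks:movie.title", "2001:movie.id"},
}
PROBS = {
    "actor=hanks AND year=2001": 0.45,
    "actor=hanks AND title=2001": 0.25,
    "director=hanks AND year=2001": 0.20,
    "title=hanks AND id=2001": 0.10,
}

question = best_option(INTERPS, PROBS)
remaining, remaining_probs = refine(INTERPS, PROBS, question, accepted=True)
```

Each answer removes the interpretations that contradict it, so a user with an uncommon informational need can still reach the intended structured query after a few questions instead of depending on the initial ranking.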

Keyword Search in Databases
Data Modeling
Structural Ambiguity of the Results
Ranking Strategies
Faceted Search in Databases
Incremental Query Construction
Techniques
Disadvantages of Existing System
Implementation Results
Get the gist
Get the answers