Abstract

Background: Instant search recommends completions of the query ‘on the fly’ and displays the results with every keystroke. These results should be robust against typographical errors, which appear not only in the query but also in the documents. Instant search also requires fast response times and a ranking of the results that brings the most important answers to the top.

Method: This study examines simple and efficient methods for instant fuzzy single-keyword and multi-keyword search that are resilient to typographical errors and that use no more than inverted and forward indices. Search results are computed incrementally from cached results, and the answers are ranked by their relevance to the query using probabilistic correlation-based ranking.

Findings: Experiments are conducted on the DBLP and Medline data sets. The execution time for obtaining answers to instant fuzzy single-keyword search is recorded for different prefix lengths. Similarly, the execution time for obtaining answers to instant fuzzy multi-keyword search is recorded for sub-queries of two and three keywords at various prefix lengths on the same data sets. To measure the usefulness of the proposed correlation-based ranking, precision is calculated for the search results. The experimental evaluation demonstrates the efficacy of the instant fuzzy search algorithms and of the probabilistic correlation-based ranking.

Applications: The proposed instant fuzzy keyword search for single and multiple keywords improves both the efficiency and the quality of the search results.

Keywords: Keyword Search, Multi-keyword Search, Fuzzy Search, Probabilistic Correlation
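
To make the Method description more concrete, the following is a minimal Python sketch of instant fuzzy single-keyword search over an inverted index. It is an illustration under stated assumptions, not the algorithm from the paper: the function names (`prefix_edit_distance`, `build_inverted_index`, `fuzzy_prefix_search`), the whitespace tokenisation, and the error threshold `max_errors` are all hypothetical. Ranking and the forward index are omitted. The sketch relies on one property that is easy to verify: extending the typed prefix by a character can never lower its prefix edit distance to a keyword, so the keywords matched at the previous keystroke can be cached and reused as the candidate pool for the next one.

```python
from collections import defaultdict


def prefix_edit_distance(query: str, keyword: str) -> int:
    """Smallest edit distance between `query` and any prefix of `keyword`."""
    # prev[j] holds the edit distance between query[:i] and keyword[:j].
    prev = list(range(len(keyword) + 1))
    for i, qc in enumerate(query, start=1):
        curr = [i]
        for j, kc in enumerate(keyword, start=1):
            cost = 0 if qc == kc else 1
            curr.append(min(prev[j] + 1,          # delete qc from the query
                            curr[j - 1] + 1,      # insert kc into the query
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    # Taking the minimum over the last row lets the keyword be cut anywhere.
    return min(prev)


def build_inverted_index(docs: dict) -> dict:
    """Map each lower-cased keyword to the set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():  # assumption: whitespace tokenisation
            index[word].add(doc_id)
    return index


def fuzzy_prefix_search(index, prefix, max_errors=1, candidates=None):
    """Return (matching keywords, matching document ids) for a typed prefix.

    `candidates` may hold the keywords matched at the previous keystroke;
    when given, only those are re-checked instead of the whole vocabulary.
    """
    pool = index.keys() if candidates is None else candidates
    prefix = prefix.lower()
    matched = {kw for kw in pool
               if prefix_edit_distance(prefix, kw) <= max_errors}
    doc_ids = set()
    for kw in matched:
        doc_ids |= index[kw]
    return matched, doc_ids


# Hypothetical toy collection; document 2 contains the typo "serach".
docs = {
    1: "keyword search in databases",
    2: "fuzzy serach engines",
    3: "graph mining",
}
index = build_inverted_index(docs)

# Simulate two keystrokes, reusing the cached keyword set from the first.
cached, _ = fuzzy_prefix_search(index, "se")
keywords, results = fuzzy_prefix_search(index, "sea", candidates=cached)
print(sorted(keywords), sorted(results))
```

In this toy run the misspelled document is still retrieved for the prefix "sea", because "serach" is within one edit of it. A multi-keyword variant could, under the same assumptions, intersect the document sets returned for each keyword.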

