Abstract

A common technique for improving query response time in retrieval systems is to store query-relevant information in fast-access memory, such as a result cache. The effectiveness of a result cache relies heavily on making correct admission and eviction decisions, which require careful analysis of query and stream characteristics. Because user access is anonymous and global, search engines are often treated as time-invariant architectures: query characteristics are frequently collected globally and assumed to remain unchanged for long periods of time. However, the highly distributed nature of modern search engine frameworks has led to noticeable temporal changes in user access patterns over short periods of time at each data center within the distributed network. The work presented here attempts to evaluate temporal variations in query submissions and exploit them to improve result caching performance. To this end, query logs are analyzed to verify the existence of such variations, and a new caching framework that accommodates them is proposed. The proposed framework partitions the result cache into three segments: a static segment that stores the most frequent queries in an offline fashion, a newly introduced semi-static segment, built on top of the state-of-the-art Static–Dynamic Cache (SDC), whose content changes during different periods of the day, and a dynamic segment that is maintained in a Least Recently Used (LRU) fashion. Experiments demonstrate that the proposed caching framework improves the hit rate of a search engine result cache by up to 3.31% and query response time by up to 7.27% with respect to state-of-the-art techniques.
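The three-segment layout described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `ThreeSegmentCache`, the period labels, the lookup order, and the assumption that the static and semi-static contents are precomputed offline from query logs are all hypothetical choices made for illustration. The abstract does not specify how periods are defined or how semi-static content is selected.

```python
from collections import OrderedDict


class ThreeSegmentCache:
    """Illustrative result cache with static, semi-static, and dynamic segments."""

    def __init__(self, static_entries, semi_static_by_period, dynamic_capacity):
        # Static segment: assumed to be filled offline with the most frequent queries.
        self.static = dict(static_entries)
        # Semi-static segment: one precomputed table per period of the day
        # (the period labels are hypothetical, e.g. "morning", "evening").
        self.semi_static_by_period = {
            period: dict(entries) for period, entries in semi_static_by_period.items()
        }
        # Dynamic segment: LRU order, least recently used entry first.
        self.dynamic = OrderedDict()
        self.dynamic_capacity = dynamic_capacity

    def get(self, query, period):
        """Return the cached result for query, or None on a miss."""
        if query in self.static:
            return self.static[query]
        semi_static = self.semi_static_by_period.get(period, {})
        if query in semi_static:
            return semi_static[query]
        if query in self.dynamic:
            self.dynamic.move_to_end(query)  # mark as most recently used
            return self.dynamic[query]
        return None

    def put(self, query, result, period):
        """Admit a result into the dynamic segment after a miss."""
        # Queries already served by the read-only segments are not duplicated.
        if query in self.static or query in self.semi_static_by_period.get(period, {}):
            return
        self.dynamic[query] = result
        self.dynamic.move_to_end(query)
        if len(self.dynamic) > self.dynamic_capacity:
            self.dynamic.popitem(last=False)  # evict the least recently used entry
```

In this sketch only the dynamic segment changes at query time. Switching the semi-static content between periods amounts to passing a different `period` value, which stands in for whatever period-detection mechanism a real deployment would use.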

