Abstract
Caching is an effective optimization in search engine. The data selection policy plays a key role in caching, which places the data to be cached in memory. However, the current data selection policies are not suitable to the hybrid storage architecture with solid state disks (SSDs), which have gradually replaced hard disk drives (HDDs) in search engines. In this paper, we present an Efficient Data Selection policy (EDS) for search engine cache management, which views cache media as a knapsack, and views results and posting lists as items. The best benefit can be computed by greedy algorithms. In order to verify the effectiveness, we carry out a series of experiments to study essential factors of data selection in different architectures, including HDD, SSD, and SSD-based hybrid storage architecture, which uses SSD as a secondary cache for memory. The experimental results demonstrate that the proposed policy improves the hit ratio by 20.04% and the retrieval performance on HDD, SSD, and hybrid architecture by 31.98%, 28.72% and 23.24%, respectively.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.