Abstract

Multitiered storage systems, which are made up of heterogeneous devices, are widely used in distributed environments to accelerate the I/O performance of upper-layer big data applications. This raises new challenges for efficient data migration among heterogeneous storage levels, such as MEM‐SSD‐HDD, through smart caching mechanisms. To optimize cache policy scheduling on a distributed tiered storage architecture, we propose a general framework with five layers: a tiered storage system layer, a cache migration policy layer, a cache policy adaptive scheduling layer, a data access pattern layer, and a big data application layer. A prototype of the framework has been designed and implemented on Alluxio, a widely used distributed hybrid storage system. To meet the demands of the big data application layer, we first designed several cache eviction and promotion policies on the cache migration policy layer that cover various access patterns (several of the proposed eviction policies have been adopted by the Alluxio open‐source community). Second, we designed and implemented two adaptive cache policy scheduling algorithms on the cache policy adaptive scheduling layer, which select appropriate policies for different scenarios. One algorithm is based on hit ratio statistics, and the other on predictions from a data access pattern model. Experimental results show that the proposed cache policies are effective for various big data applications, such as Spark SQL. Compared with using a single eviction policy, the proposed scheduling algorithms, which switch among multiple eviction policies, improve the hit ratio by around 20%.
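
The idea of choosing an eviction policy from hit ratio statistics can be illustrated with a minimal sketch. It is not the paper's algorithm or Alluxio's API: the class names `LRUCache` and `LFUCache`, the function `choose_policy`, and the `capacity` and `window` parameters are all hypothetical, and the two candidate policies are plain textbook LRU and LFU. The sketch replays an access trace through one shadow cache per policy and, at the end of each window, reports the policy with the highest hit ratio.

```python
from collections import Counter, OrderedDict


class LRUCache:
    """Textbook least-recently-used cache (illustrative, not Alluxio's evictor)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def access(self, key):
        hit = key in self.blocks
        if hit:
            self.blocks.move_to_end(key)          # mark as most recently used
        else:
            if len(self.blocks) >= self.capacity:
                self.blocks.popitem(last=False)   # evict least recently used
            self.blocks[key] = True
        return hit


class LFUCache:
    """Textbook least-frequently-used cache; counts cover resident blocks only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.freq = Counter()

    def access(self, key):
        hit = key in self.freq
        if not hit and len(self.freq) >= self.capacity:
            victim = min(self.freq, key=self.freq.get)  # lowest access count
            del self.freq[victim]
        self.freq[key] += 1
        return hit


def choose_policy(trace, capacity, window):
    """Return (best_policy, hit_ratio) for each full window of the trace.

    Assumption: each candidate policy runs as a shadow cache over the same
    trace, and the policy with the highest hit ratio in a window is selected.
    """
    shadows = {"LRU": LRUCache(capacity), "LFU": LFUCache(capacity)}
    hits = {name: 0 for name in shadows}
    choices = []
    for i, key in enumerate(trace, start=1):
        for name, cache in shadows.items():
            hits[name] += cache.access(key)
        if i % window == 0:
            best = max(hits, key=hits.get)
            choices.append((best, hits[best] / window))
            hits = {name: 0 for name in shadows}
    return choices


if __name__ == "__main__":
    # One hot block mixed with a scan: frequency-based eviction keeps the hot block.
    print(choose_policy([1, 1, 1, 2, 3, 4, 1, 2, 3, 4, 1], capacity=2, window=11))
    # Working set shifts from block 1 to blocks 2 and 3: recency-based eviction adapts.
    print(choose_policy([1, 1, 1, 1, 2, 3, 2, 3, 2, 3, 2, 3], capacity=2, window=12))
```

On the first trace the sketch selects LFU and on the second it selects LRU, which shows why no single eviction policy suits every access pattern. A real scheduler would also need to account for the cost of switching policies, which this sketch ignores.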
