Block Placement in Distributed File Systems Based on Block Access Frequency

Jianwei Liao, Francois Trahay, Xiaoning Peng, Zhigang Cai

doi:10.1109/access.2018.2851571

Open Access


Abstract

This paper proposes a new data placement policy that allocates data blocks across the storage servers of distributed/parallel file systems so that the block access workload is evenly distributed. To this end, we first analyze the block access history of a specific application and then introduce a k-partition algorithm that divides the data blocks into multiple groups according to their access frequency. Each resulting group carries almost the same access workload, so the groups can be distributed across the storage servers of the distributed file system to spread block accesses uniformly while the application runs. In summary, the proposed data placement policy yields not only an even data distribution but also balanced block access. Experimental results show that, compared with the commonly used round-robin block placement policy, the proposed scheme greatly reduces I/O time and improves the utilization of storage servers when running database-related applications.
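The idea of grouping blocks by access frequency can be illustrated with a small sketch. The abstract does not spell out the k-partition algorithm, so the code below substitutes a common greedy heuristic (assign the hottest remaining block to the currently least-loaded group); it is not the authors' method. The function names `partition_blocks` and `round_robin_loads`, the trace format (a list of integer block IDs), and the sample data are all assumptions made for illustration.

```python
import heapq
from collections import Counter


def partition_blocks(access_trace, k):
    """Greedy sketch: split block IDs into k groups with similar total access counts.

    This is a stand-in heuristic, not the k-partition algorithm from the paper.
    access_trace is assumed to be a list of block IDs, one entry per access.
    """
    freq = Counter(access_trace)
    # Min-heap of (accumulated access count, group index).
    heap = [(0, g) for g in range(k)]
    groups = [[] for _ in range(k)]
    # Visit blocks from hottest to coldest; ties are broken by block ID.
    for block, count in sorted(freq.items(), key=lambda kv: (-kv[1], kv[0])):
        load, g = heapq.heappop(heap)      # least-loaded group so far
        groups[g].append(block)
        heapq.heappush(heap, (load + count, g))
    loads = [sum(freq[b] for b in grp) for grp in groups]
    return groups, loads


def round_robin_loads(access_trace, k):
    """Per-server access counts when block b is stored on server b % k."""
    freq = Counter(access_trace)
    loads = [0] * k
    for block, count in freq.items():
        loads[block % k] += count
    return loads


# Hypothetical trace: block 0 is accessed 6 times, block 1 five times, ..., block 5 once.
trace = [b for b, n in enumerate([6, 5, 4, 3, 2, 1]) for _ in range(n)]

groups, loads = partition_blocks(trace, k=3)
print(groups, loads)                 # [[0, 5], [1, 4], [2, 3]] [7, 7, 7]
print(round_robin_loads(trace, 3))   # [9, 7, 5]
```

On this toy trace both placements put two blocks on each server, but round-robin leaves the per-server access counts at 9, 7, and 5, whereas the frequency-aware grouping yields 7 on every server. A greedy heuristic like this one does not guarantee equal group sizes in general; a placement that must also keep the data distribution even would need an additional per-group capacity limit.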
