Distributed Placement of Replicas in Hierarchical Data Grids with User and System QoS Constraints

M Shorfuzzaman, P Graham, R Eskicioglu

doi:10.1109/3pgcic.2011.35

Abstract

Data grids support distributed data-intensive applications that need to access massive datasets stored around the world. Efficient access to such datasets is hindered by the high latencies of wide-area networks. To speed up access, files can be replicated so that a user can access a nearby replica. Much of the work on the replica placement problem in data grids has focused on average system performance and ignored quality assurance issues. The existing work that does consider QoS often assumes a simplified replication model, so the resulting solutions may not be applicable to real systems. In this paper, we introduce a more realistic model for replica placement in hierarchical data grids, which determines the positions of a minimum number of replicas expected to satisfy certain quality requirements from both user and system perspectives. Our placement algorithm is based on a highly distributed and decentralized technique that exploits the data access history for popular data files and computes replica locations by minimizing overall replication cost (read and update) while maximizing QoS satisfaction for a given traffic pattern. The problem is formulated using dynamic programming. We assess our algorithm using OptorSim. Simulation results demonstrate the effectiveness of our replica placement technique under various factors, including the storage and workload constraints of replica servers, link capacity constraints, and user QoS requirements.
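To make the idea of a dynamic-programming formulation concrete, the following is a minimal, centralized sketch on a toy tree. It is not the paper's distributed algorithm: it ignores storage, workload, and link capacity constraints, and every name and number in it (`CHILDREN`, `READS`, `QOS`, `UPDATE_COST`, `best`, `place_replicas`) is a hypothetical choice made for illustration. It assumes that a request travels up the hierarchy until it meets a replica, that read cost is the number of reads times the hop distance, that each replica incurs a fixed update cost, and that a node's QoS requirement is a maximum hop count to the nearest replica.

```python
from functools import lru_cache

# Hypothetical toy hierarchy: node -> children. Node 0 is the root and is
# assumed to always hold the master copy.
CHILDREN = {0: [1, 2], 1: [3, 4], 2: [5], 3: [], 4: [], 5: []}
READS = {0: 0, 1: 0, 2: 0, 3: 10, 4: 1, 5: 4}   # assumed read counts per node
QOS = {0: 0, 1: 2, 2: 2, 3: 1, 4: 2, 5: 2}      # assumed max hops to a replica
UPDATE_COST = 6                                  # assumed cost of maintaining one replica


@lru_cache(maxsize=None)
def best(node, dist):
    """Cheapest plan for the subtree rooted at `node`, given that the nearest
    replica above `node` is `dist` hops away. Returns (cost, replica set)."""
    # Option A: place a replica at this node; children then see it 1 hop away.
    cost_a, set_a = UPDATE_COST, frozenset([node])
    for child in CHILDREN[node]:
        c, s = best(child, 1)
        cost_a, set_a = cost_a + c, set_a | s

    # Option B: no replica here. Infeasible if this node issues reads and the
    # nearest replica above is farther than its QoS bound.
    if READS[node] > 0 and dist > QOS[node]:
        return cost_a, set_a
    cost_b, set_b = READS[node] * dist, frozenset()
    for child in CHILDREN[node]:
        c, s = best(child, dist + 1)
        cost_b, set_b = cost_b + c, set_b | s

    # Prefer fewer replicas on ties.
    return (cost_b, set_b) if cost_b <= cost_a else (cost_a, set_a)


def place_replicas(root=0):
    """Combine the children of the root, which already holds the data."""
    total, chosen = 0, frozenset([root])
    for child in CHILDREN[root]:
        c, s = best(child, 1)
        total, chosen = total + c, chosen | s
    return total, set(chosen)


total_cost, replicas = place_replicas()
print(total_cost, sorted(replicas))
```

In this toy instance the read-heavy leaf with a tight QoS bound attracts its own replica, while lightly used nodes are served from higher in the tree. That trade-off between read cost, update cost, and QoS is the one the abstract describes, here in a much simplified form.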

Full Text