Abstract

In cloud storage systems, users must be able to shut down an application when it is not in use and restart it from the last consistent state when required. BlobSeer is a data storage application designed for distributed systems and built as an alternative to the popular open-source Hadoop Distributed File System (HDFS). In a cloud model, all components need to stop and restart from a consistent state whenever the user requires it. One limitation of the BlobSeer DFS is the possibility of data loss when the system restarts. It is therefore important to give BlobSeer components consistent start and stop states when they are used in a cloud environment. In this paper, we investigate whether BlobSeer can provide a consistent-state distributed data storage system through the integration of checkpoint-restart functionality. To demonstrate that a consistent state is available, we set up a cluster of multiple machines and deploy BlobSeer entities with checkpointing functionality across them. We adopt uncoordinated checkpointing algorithms, for their benefits over the alternatives, when integrating the functionality into BlobSeer components such as the Version Manager (VM) and the Data Provider. The experimental results show that, with checkpointing integrated, a consistent state can be ensured for a distributed storage system even when the system restarts, preventing data loss after the system has encountered various errors and failures.
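As a rough illustration of the uncoordinated checkpoint-restart idea described in the abstract, the following Python sketch shows a single component saving its own state to stable storage on its own schedule and reloading it on restart. It is a minimal sketch under stated assumptions, not BlobSeer's implementation: the class `CheckpointedComponent`, its methods, the state layout, and the JSON file format are all hypothetical names chosen for the example, and the sketch ignores consistency of messages exchanged between components.

```python
import json
import os
import tempfile


class CheckpointedComponent:
    """Hypothetical stand-in for a storage component (not BlobSeer's API)."""

    def __init__(self, name, checkpoint_dir):
        self.name = name
        self.path = os.path.join(checkpoint_dir, f"{name}.ckpt.json")
        # Assumed state layout: a version counter and a blob-to-version map.
        self.state = {"latest_version": 0, "blobs": {}}

    def publish_version(self, blob_id):
        # Mutate in-memory state, as a version manager might on each write.
        self.state["latest_version"] += 1
        self.state["blobs"][blob_id] = self.state["latest_version"]

    def checkpoint(self):
        # Uncoordinated: the component decides alone when to checkpoint and
        # exchanges no messages with other components.
        fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(self.path), suffix=".tmp")
        with os.fdopen(fd, "w") as tmp:
            json.dump(self.state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Atomic rename so a crash never leaves a half-written checkpoint.
        os.replace(tmp_path, self.path)

    def restart(self):
        # Restore the last checkpoint if one exists; otherwise keep the empty state.
        if os.path.exists(self.path):
            with open(self.path) as f:
                self.state = json.load(f)


def demo():
    with tempfile.TemporaryDirectory() as checkpoint_dir:
        vm = CheckpointedComponent("version_manager", checkpoint_dir)
        vm.publish_version("blob-a")
        vm.publish_version("blob-b")
        vm.checkpoint()
        vm.publish_version("blob-c")  # made after the checkpoint, so not saved

        # Simulate a stop and restart with a fresh object reading the same directory.
        restarted = CheckpointedComponent("version_manager", checkpoint_dir)
        restarted.restart()
        return restarted.state
```

Calling `demo()` returns the state as of the last checkpoint: the updates for `blob-a` and `blob-b` survive the restart, while the later update for `blob-c` does not. Writing to a temporary file and renaming it is one common way to keep the on-disk checkpoint consistent; the source does not say which mechanism BlobSeer uses.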

Highlights

  • We first present the details of the experimental setup and explain the findings obtained by considering the restart approach for the Version Manager (VM) and the Data Provider.

  • The BlobSeer distributed file system (DFS) is a data storage system that was built as an alternative to existing popular open-source data storage systems for distributed environments.

  • BlobSeer DFS can be used in emerging distributed technologies such as cloud computing, but one of its limitations is the possibility of losing data when the system restarts.


Summary

Introduction

Modern cloud computing systems (CSSs) [3], built on the principles of distributed systems, can provide large storage and computing capacity in a scalable manner [4,5], with high availability and reliability [6], for a wide range of data-, compute-, and concurrent-access-intensive services [7,8,9,10,11,12]. This capability is enabled by distributed file systems (DFSs), which are considered one of the core components of such distributed systems.

BlobSeer Distributed File System
Checkpointing Mechanisms
Uncoordinated Checkpoints
Coordinated Checkpoints
Communication-Induced Checkpoints
Proposed Approach
Experimental Results
Experimental Setup
Checkpoint Restart Approach of Version Manager
Without Restart Approach
With Restart Approach
Checkpoint Restart Approach of Data Provider
Without Restart Approach of Data Provider
With Restart Approach of Data Provider
Conclusions