Abstract

This survey explains the different approaches to synchronizing replicas of files placed on distributed systems, covering both older and more recent techniques. Some techniques rely on the intervention of metadata servers (MDS), while others work without any MDS involvement; in the latter, the storage servers (SS) carry out synchronization among replicas. CEPH is a distributed file system designed to maximize performance, scalability, and reliability. It separates metadata management from data management through an object storage layer that runs on object-based file systems, and it provides excellent I/O and metadata management. Commodity servers and disks are used for multitier distributed systems. Performance, reliability, I/O rate, the workload of write operations, and low synchronization overhead are the main concerns when synchronizing replicas. Hadoop and the Google File System are distributed file systems. Hadoop provides good I/O performance for data-intensive applications with minimal replica synchronization, and it provides fault tolerance. Some strategies are aimed specifically at data-intensive applications. A parallel file system is a type of distributed file system, and the analysis emphasizes good performance for both small and large I/O requests. The pattern-directed and layout-aware replication technique is one of the most optimized techniques for parallel file systems. Data access performance, reliability, data consistency, centralized synchronization, reduced workload, and reduced overhead are the main focus of all the techniques. Other file systems, such as SOFA and Frangipani, focus on data consistency and on reducing bandwidth consumption.

Highlights

  • A distributed file system, in which files are placed across a distributed environment, improves I/O performance and system reliability

  • The data structure adopted by GMEI is the chunk list

  • The chunk list is distributed over the storage servers

Summary

Introduction

A distributed file system, in which files are placed across a distributed environment, improves I/O performance and system reliability. The storage server is responsible for managing file data and performing I/O operations after obtaining information from the metadata server (MDS). The contributions of the proposed mechanism include the replication chunk list kept on the storage servers: it holds the replica data and ensures replica consistency whenever an update occurs, without any MDS involvement. In the Hadoop file system, by contrast, replica synchronization is triggered by the MDS whenever a replica is updated or modified, and the storage servers only handle data management. The number of write operations determines how many times synchronization occurs, which can become excessive; metadata servers were used to improve this. CEPH is a distributed file system that separates data management from metadata management to ensure better scalability, performance, and reliability. Lustre is a distributed, high-performance file system that mitigates the availability and scalability problems of distributed file systems.
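
The idea of storage servers synchronizing replicas among themselves through a chunk list can be illustrated with a small sketch. This is not GMEI's actual protocol: the class names, the per-chunk version counter, and the `sync_to` method are all assumptions made for illustration. It only shows how a per-chunk version lets one deferred synchronization replace a separate synchronization per write, with no metadata server in the path.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Chunk:
    """One entry of the (hypothetical) chunk list: a version and its data."""
    version: int = 0
    data: bytes = b""


@dataclass
class StorageServer:
    """Hypothetical storage server holding the chunk list of one file."""
    name: str
    chunk_list: Dict[int, Chunk] = field(default_factory=dict)

    def write(self, chunk_id: int, data: bytes) -> None:
        # A local write bumps the chunk's version; no MDS is contacted.
        old = self.chunk_list.get(chunk_id, Chunk())
        self.chunk_list[chunk_id] = Chunk(old.version + 1, data)

    def stale_chunks(self, peer: "StorageServer") -> List[int]:
        # Chunks for which the peer holds an older version, or none at all.
        return [
            cid for cid, chunk in self.chunk_list.items()
            if peer.chunk_list.get(cid, Chunk()).version < chunk.version
        ]

    def sync_to(self, peer: "StorageServer") -> int:
        # Push only the stale chunks to the peer; return how many were sent.
        stale = self.stale_chunks(peer)
        for cid in stale:
            src = self.chunk_list[cid]
            peer.chunk_list[cid] = Chunk(src.version, src.data)
        return len(stale)


primary = StorageServer("ss-1")
replica = StorageServer("ss-2")

# Three writes reach the primary, two of them on the same chunk.
primary.write(0, b"a")
primary.write(0, b"b")
primary.write(1, b"c")

# One deferred synchronization transfers two chunks instead of three writes.
sent = primary.sync_to(replica)
print(sent, replica.chunk_list[0].data, replica.chunk_list[0].version)
```

In this sketch the cost of synchronization depends on the number of modified chunks rather than on the number of write operations, which is the kind of reduction the surveyed techniques aim for.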

GMEI file system
Lazy replica
Version based update replay
In Hadoop file system
Asynchronous replication
Replica synchronization in CEPH
In Lustre
In Google file system
Conclusion