Abstract

The conceptual model of the system is developed and described in detail. An intelligent system for deduplication and distribution of data in cloud storage is developed, the software is described, and the stages of the user's workflow are considered. The designed system was tested: several control samples are described and the results are analyzed.

The purpose of the system is to deduplicate and distribute data across cloud repositories so that the end result of a backup contains no duplicate pieces of data, using distributed computing and cloud storage. By choosing the right approach to distributing tasks and data during deduplication, the full potential of cloud-based distributed systems can be harnessed to increase backup speed and throughput. The advantages and disadvantages of the different approaches are analyzed, and effective methods are selected: hybrid block-level deduplication, splitting of the data stream on the basis of the Rabin fingerprint, distribution of data based on the hash values of deduplicated blocks, and use of a distributed index.

Block-level deduplication admits two ways of splitting the data stream into blocks: fixed-length chunking and algorithmic, variable-length (content-defined) chunking. Fixed-length partitioning is trivial and fast in terms of algorithmic complexity, but its downside is sensitivity to displacement: if data is inserted or shifted near the beginning of the stream, every block boundary after the change moves, so all subsequent blocks are considered new. With variable-length partitioning, by contrast, the proper split point is determined by the algorithm itself. Such an algorithm must work on unbounded data streams using a rolling hash function: it absorbs each input byte from the stream, and as soon as the value of the rolling hash matches a previously specified pattern, that position serves as a point for splitting the stream into blocks. Thus, if the data is changed or shifted by a couple of bytes, only the block that covers the change is considered new. To track changes and place breakpoints correctly, the input data must be checked against a preset digital pattern, i.e. a hash value. Common practice is to recompute the hash value every time a byte arrives in the data stream; the split point is the moment the resulting hash value matches the specified pattern. Rolling hash algorithms were devised to perform these calculations efficiently, and one of the most common of them is the Rabin fingerprint.

Based on the analysis of available solutions, the Rust programming language was selected for the client side, the Scala programming language for the server side, Akka as the distributed computing toolkit, and Amazon S3 as the cloud storage.
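
The content-defined chunking step described above can be illustrated with a short sketch. The snippet below is a minimal, self-contained illustration, not the system's actual implementation: for brevity it uses a gear-style rolling hash rather than a true Rabin polynomial fingerprint, and the boundary mask, minimum and maximum chunk sizes, and table seed are all assumed values.

```rust
// Minimal sketch of content-defined chunking with a rolling hash.
// A gear-style hash stands in for a true Rabin fingerprint; the mask
// and size limits are illustrative assumptions.

const MASK: u64 = (1 << 13) - 1; // boundary when low 13 bits are zero, ~8 KiB average chunk
const MIN_CHUNK: usize = 2 * 1024;
const MAX_CHUNK: usize = 64 * 1024;

/// 256 pseudo-random 64-bit values (SplitMix64 with a fixed seed),
/// used to mix each input byte into the rolling hash.
fn gear_table() -> [u64; 256] {
    let mut table = [0u64; 256];
    let mut x: u64 = 0;
    for entry in table.iter_mut() {
        x = x.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = x;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        *entry = z ^ (z >> 31);
    }
    table
}

/// Split `data` into variable-length blocks: a boundary is declared
/// whenever the rolling hash matches the preset pattern (its low bits
/// are all zero) and the block length is within the allowed range.
fn chunk(data: &[u8]) -> Vec<&[u8]> {
    let gear = gear_table();
    let mut chunks = Vec::new();
    let (mut start, mut hash) = (0usize, 0u64);
    for (i, &byte) in data.iter().enumerate() {
        // Absorb one byte; older bytes age out as the hash shifts left.
        hash = (hash << 1).wrapping_add(gear[byte as usize]);
        let len = i + 1 - start;
        if (len >= MIN_CHUNK && (hash & MASK) == 0) || len >= MAX_CHUNK {
            chunks.push(&data[start..=i]);
            start = i + 1;
            hash = 0;
        }
    }
    if start < data.len() {
        chunks.push(&data[start..]); // trailing partial block
    }
    chunks
}

fn main() {
    // 1 MB of varied sample data, chunked into content-defined blocks.
    let data: Vec<u8> = (0..1_000_000u32).map(|i| (i * 31 % 251) as u8).collect();
    let blocks = chunk(&data);
    println!("{} blocks, first block is {} bytes", blocks.len(), blocks[0].len());
}
```

Because a boundary depends only on the bytes recently absorbed into the rolling hash, inserting a few bytes early in the stream shifts at most the blocks around the edit; later boundaries resynchronize, which is exactly the property fixed-length chunking lacks.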
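
Distribution by block hash can be sketched in the same spirit. In the snippet below, the node count, the `hash % num_nodes` routing rule, and the use of FNV-1a as a stand-in for the real content hash are all assumptions for illustration; the point is that a block's hash deterministically selects the shard of the distributed index that answers the duplicate check, so no cross-node lookup is needed.

```rust
use std::collections::HashSet;

// Minimal sketch of hash-based block distribution over a sharded index.
// FNV-1a stands in for the real content hash; node count and routing
// rule are illustrative assumptions.

/// FNV-1a over the block contents; used both as the block's identity
/// and as its routing key.
fn block_hash(block: &[u8]) -> u64 {
    let mut h: u64 = 0xCBF2_9CE4_8422_2325;
    for &b in block {
        h ^= b as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01B3);
    }
    h
}

/// One shard of the distributed index: remembers which block hashes
/// this node has already stored.
struct IndexShard {
    seen: HashSet<u64>,
}

impl IndexShard {
    /// Returns true if the block is new on this node (must be uploaded),
    /// false if it is a duplicate and the upload can be skipped.
    fn offer(&mut self, hash: u64) -> bool {
        self.seen.insert(hash)
    }
}

fn main() {
    let num_nodes: u64 = 4;
    let mut shards: Vec<IndexShard> = (0..num_nodes)
        .map(|_| IndexShard { seen: HashSet::new() })
        .collect();

    let blocks: Vec<&[u8]> = vec![&b"alpha"[..], &b"beta"[..], &b"alpha"[..], &b"gamma"[..]];
    for block in blocks {
        let h = block_hash(block);
        // The same hash always routes to the same shard, so the
        // duplicate check never requires cross-node coordination.
        let node = (h % num_nodes) as usize;
        if shards[node].offer(h) {
            println!("node {node}: store new block {h:016x}");
        } else {
            println!("node {node}: duplicate block {h:016x}, upload skipped");
        }
    }
}
```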
