Abstract

The growth of the big data processing market has led to increasing load on computing data centers and to changes in the methods used for storing data, in the communication between computing units, and in the computational time needed to process or modify the data. Methods of distributed and parallel data processing have brought new problems related to computation over data that need to be examined. Unlike conventional cloud services, a tight coupling between the data and the computations is one of the main characteristics of big data services: computational tasks can be performed only if the relevant data are available. Three factors that influence the speed and efficiency of data processing are data duplicity, data integrity, and data security. We are motivated to study the problems related to the growing time needed for data processing by optimizing these three factors in geographically distributed data centers.

Highlights

  • The rapid change in the volume and nature of data has led to a shift in database types, with data migrating from relational to non-relational databases, and has driven the rapid development of the technology known as distributed data processing

  • In a rapidly developing world, duplicate suppression has a significant influence on system performance and on the speed of storing data in data storages

  • The primary objective of many researchers in this area is to examine the problems related to duplicate suppression: searching user data across various data storages and reducing the system resources needed to process the data


Summary

Introduction

The rapid change in the volume and nature of data has led to a shift in database types, with data migrating from relational to non-relational databases, and has driven the rapid development of the technology known as distributed data processing. Considerable effort has been devoted to reducing the issues that arise in processes working with distributed data. The optimization studied here addresses data duplicity, data integrity, and data security in geographically distributed centers, in order to overcome the weak points of distributed data processing in a cloud environment. The goal is to minimize the number of duplicates of the big data, reduce the number of accesses, and increase data integrity. The main benefits of the presented research can be summarized as follows: as the amount of data grows, it is necessary to control the data in the system so that the speed of data processing increases while the storage space needed for the data is effectively reduced. The conclusion of the presented research and experiments is provided in Section 5.
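This excerpt does not describe the paper's deduplication mechanism in detail. As a minimal illustrative sketch only, assuming a content-hash approach (not necessarily the authors' method), duplicate suppression can be implemented by fingerprinting each data block and writing a block only when its fingerprint has not been seen before, so a repeated block consumes storage space and write accesses only once:

```python
# Illustrative sketch of content-hash-based duplicate suppression
# (an assumed technique, not the method proposed in the paper).
import hashlib


class DedupStore:
    """Toy in-memory store that keeps a single copy of each distinct block."""

    def __init__(self):
        self._blocks = {}        # SHA-256 fingerprint -> stored block
        self.writes_saved = 0    # number of duplicate writes suppressed

    def put(self, block):
        """Store a block and return its fingerprint (used as its address)."""
        fingerprint = hashlib.sha256(block).hexdigest()
        if fingerprint in self._blocks:
            self.writes_saved += 1          # duplicate: no new write is needed
        else:
            self._blocks[fingerprint] = block
        return fingerprint

    def get(self, fingerprint):
        return self._blocks[fingerprint]


if __name__ == "__main__":
    store = DedupStore()
    blocks = [b"sensor batch A", b"sensor batch B", b"sensor batch A"]
    refs = [store.put(b) for b in blocks]
    # Only two distinct blocks are kept; the repeated block is referenced twice.
    print(len(store._blocks), "blocks stored;", store.writes_saved, "duplicate write suppressed")
```

A real system would persist the fingerprint index and coordinate it across geographically distributed data centers; those aspects are outside the scope of this sketch.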

Related work
Design and implementation of the real-time data catching method
One data storage
Multiple data storage
Experiments for the real-time data catching method
Results of experiment for the proposed methodology
Conclusion