Abstract

The growth of the big data processing market has led to increasing load on computing data centers and to changes in the methods used for storing data, in the communication between computing units, and in the computational time needed to process or modify the data. Methods of distributed and parallel data processing have brought new problems related to computation over data that need to be examined. Unlike conventional cloud services, a tight coupling between the data and the computations is one of the main characteristics of big data services: computational tasks can be performed only if the relevant data are available. Three factors that influence the speed and efficiency of data processing are data duplicity, data integrity, and data security. We are motivated to study the problems related to the growing time needed for data processing by optimizing these three factors in geographically distributed data centers.

Highlights

  • The rapid change in the volume and nature of data has led to a shift in database types, with data migrating from relational to non-relational databases, and has driven the rapid development of the technology known as distributed data processing

  • In a rapidly developing world, duplicate suppression has a significant influence on system performance and on the speed of storing data in data storages

  • The primary objective of many researchers in this area is to examine the problems related to duplicate suppression: searching user data across various data storages and reducing the system resources needed to process the data


Summary

Introduction

The rapid change in the volume and nature of data has led to a shift in database types, with data migrating from relational to non-relational databases, and has driven the rapid development of the technology known as distributed data processing. Considerable effort has been devoted to reducing the issues that arise in processes working with distributed data. The optimization studied here addresses data duplicity, data integrity, and data security in geographically distributed centers, in order to overcome the weak points of distributed data processing in a cloud environment. The goal is to minimize the number of duplicates of the big data, reduce the number of accesses, and increase data integrity. The main benefits of the presented research can be summarized as follows: as the amount of data grows, it is necessary to control the data in the system so that the speed of data processing increases while the storage space needed for the data is effectively reduced. The conclusion of the presented research and experiments is provided in Section 5.
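This excerpt does not describe the paper's deduplication mechanism in detail. As a minimal illustrative sketch only, assuming a content-hash approach (not necessarily the authors' method), duplicate suppression can be implemented by fingerprinting each data block and writing a block only when its fingerprint has not been seen before, so a repeated block consumes storage space and write accesses only once:

```python
# Illustrative sketch of content-hash-based duplicate suppression
# (an assumed technique, not the method proposed in the paper).
import hashlib


class DedupStore:
    """Toy in-memory store that keeps a single copy of each distinct block."""

    def __init__(self):
        self._blocks = {}        # SHA-256 fingerprint -> stored block
        self.writes_saved = 0    # number of duplicate writes suppressed

    def put(self, block):
        """Store a block and return its fingerprint (used as its address)."""
        fingerprint = hashlib.sha256(block).hexdigest()
        if fingerprint in self._blocks:
            self.writes_saved += 1          # duplicate: no new write is needed
        else:
            self._blocks[fingerprint] = block
        return fingerprint

    def get(self, fingerprint):
        return self._blocks[fingerprint]


if __name__ == "__main__":
    store = DedupStore()
    blocks = [b"sensor batch A", b"sensor batch B", b"sensor batch A"]
    refs = [store.put(b) for b in blocks]
    # Only two distinct blocks are kept; the repeated block is referenced twice.
    print(len(store._blocks), "blocks stored;", store.writes_saved, "duplicate write suppressed")
```

A real system would persist the fingerprint index and coordinate it across geographically distributed data centers; those aspects are outside the scope of this sketch.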

Related work
Design and implementation of the real-time data catching method
One data storage
Multiple data storage
Experiments for the real-time data catching method
Results of experiment for the proposed methodology
Conclusion