Fragmentation and de-duplication of data sets can be used to mitigate data overload on cloud servers. The growth of data usage and processing in the cloud has brought new challenges to data management in cloud computing. We propose an approach that reduces the data load in the cloud and lowers storage and management costs for users. Duplicate detection plays a major role in data management: a data de-duplication system computes a distinct fingerprint for every data chunk using hash algorithms such as MD5 or SHA, and the resulting fingerprint is then compared against the other chunks held in a dedicated chunk database. Although only one copy of each file is stored in the cloud, that single copy may be owned by a large number of users, so de-duplication improves storage utilization while reducing reliability. To address these security challenges, this work makes the first attempt to formalize the notion of a distributed, reliable de-duplication system. In the proposed system, data chunks are distributed across multiple cloud servers, improving reliability while still eliminating redundant copies of the data. The security requirements of data confidentiality and tag consistency are also achieved by introducing a deterministic secret-sharing scheme into the distributed storage system, instead of the convergent encryption used in previous de-duplication systems.
Keywords: Cloud, Cloud Storage, Data Mapping, File Data Security, Fragmentation, Graph Colouring Algorithm, Graphical Representation, Node Allocation, Performance
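The fingerprint-based duplicate detection described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes fixed-size chunking and uses SHA-256 as the hash (the abstract names MD5 and SHA generically); the `ChunkStore` class and chunk size are hypothetical choices for the example.

```python
import hashlib

CHUNK_SIZE = 4  # tiny for illustration; real systems use KiB-scale chunks


def chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Fragment the data into fixed-size chunks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(chunk: bytes) -> str:
    """Compute the per-chunk fingerprint used for duplicate detection."""
    return hashlib.sha256(chunk).hexdigest()


class ChunkStore:
    """Dedicated chunk database mapping fingerprint -> stored chunk."""

    def __init__(self) -> None:
        self.store: dict[str, bytes] = {}

    def put(self, chunk: bytes) -> bool:
        """Store the chunk only if its fingerprint is new.

        Returns True if the chunk was stored, False if it was a
        duplicate (only one physical copy is ever kept).
        """
        fp = fingerprint(chunk)
        if fp in self.store:
            return False
        self.store[fp] = chunk
        return True


store = ChunkStore()
data = b"ABCDABCDEFGH"  # fragments into ABCD, ABCD, EFGH
results = [store.put(c) for c in chunks(data)]
# The repeated ABCD chunk is detected as a duplicate, so only two
# distinct chunks end up in the store.
```

A full system as described in the abstract would additionally split each stored chunk into shares via deterministic secret sharing and spread those shares across several cloud servers; the sketch covers only the fingerprint-comparison step.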