Abstract

Data deduplication is an essential means of eliminating redundant data and plays a significant role in today's era of massive data growth. In recent years, the rapid development of related industries such as marine monitoring has caused marine monitoring data to grow explosively, raising storage costs for marine observation stations. Faced with this surge in data volume, a natural response is to use data deduplication to reduce the amount of stored data and thus the storage cost. Many deduplication techniques are available, however. Block-level deduplication is better suited to this task, and its core problem is how to cut the data into blocks, so this paper proposes a segmentation technique based on dual sliding windows. The dual-sliding-window structure makes the sizes of the resulting data blocks more uniform, which reduces the memory consumed by the fingerprint table. We also add a prediction algorithm to the deduplication system that predicts the cut points of data blocks to improve cutting efficiency. In addition, we propose a more accurate method for calculating the deduplication ratio, which allows a more accurate comparison of algorithm performance, and we use it to obtain the final experimental results of this paper. Moreover, we propose a model based on Markov prediction for storing massive ocean data, which saves more resources. At the end of the article, we compare commonly used segmentation algorithms through careful experiments, and we use a public dataset to compare their duplicate-detection rates.
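As a rough illustration of what block-level deduplication involves in general, the following Python sketch cuts a byte stream into variable-size blocks with a single sliding window, fingerprints each block, and keeps one copy per fingerprint. It is not the dual-sliding-window method or the prediction algorithm proposed in the paper, whose details the abstract does not give. The function names, the rolling-sum hash, and the parameters `window`, `mask`, `min_size`, and `max_size` are all assumptions made for this example, and the ratio computed is the conventional original-size over stored-size figure, not the paper's refined calculation.

```python
import hashlib


def split_chunks(data, window=16, mask=0x3F, min_size=32, max_size=512):
    """Cut data into variable-size blocks (hypothetical single-window cutter).

    A rolling sum over the last `window` bytes acts as a toy hash. A cut is
    made when its low bits match `mask` and the block has at least `min_size`
    bytes, or when the block reaches `max_size` bytes.
    """
    chunks = []
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i - start >= window:
            # Drop the byte that just left the sliding window.
            rolling -= data[i - window]
        length = i - start + 1
        if (length >= min_size and (rolling & mask) == mask) or length >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            rolling = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing block without a cut point
    return chunks


def deduplicate(data):
    """Return (store, recipe): unique blocks by fingerprint, and block order."""
    store = {}   # fingerprint table: SHA-256 hex digest -> block bytes
    recipe = []  # fingerprints in original order, enough to rebuild the data
    for chunk in split_chunks(data):
        fingerprint = hashlib.sha256(chunk).hexdigest()
        store.setdefault(fingerprint, chunk)
        recipe.append(fingerprint)
    return store, recipe


def dedup_ratio(data, store):
    """Conventional ratio: original bytes divided by bytes actually stored."""
    stored = sum(len(chunk) for chunk in store.values())
    return len(data) / stored if stored else 1.0
```

For example, `store, recipe = deduplicate(payload)` followed by `b"".join(store[f] for f in recipe)` rebuilds `payload`, and the size of `store` shows why more uniform block sizes matter: every unique block adds one entry to the in-memory fingerprint table.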
