Abstract

Heterogeneous HDFS clusters usually combine several types of storage media. Reading and writing file replicas efficiently while making reasonable use of each medium remains an open problem. Dynamically adjusting the number of replicas is important in HDFS because it relieves the load caused by many simultaneous accesses to hot files and improves the efficiency of cluster services. This paper introduces a method for dynamically calculating the number of HDFS replicas based on file access popularity. First, an algorithm is proposed to predict file popularity using a Markov model optimized by cuckoo search. An unbiased grey model predicts a file's popularity at the next moment from its recent access history; the Markov model, optimized by cuckoo search, then corrects the prediction error. Next, a method for calculating the number of replicas is designed based on the predicted popularity of the file to be accessed and the availability of the node. Experiments show that the predictions fit the actual values closely, with a MAPE of 3.08%, the lowest among the several commonly used prediction models compared. On the CloudSim4.0 simulation platform, multiple users write 10 files to the cluster at the same time, and the number of replicas is adjusted according to the predicted value at the next moment, so as to improve user access efficiency.
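The grey-model step of the prediction and the MAPE metric can be illustrated with a minimal sketch. It assumes the standard textbook formulation of the unbiased GM(1,1) model, which may differ in detail from the variant used in the paper, and it does not cover the cuckoo-search-optimized Markov correction or the replica calculation. The function names `ugm11_forecast` and `mape` are hypothetical and chosen only for illustration.

```python
import math


def ugm11_forecast(series, steps=1):
    """Forecast the next `steps` values with a textbook unbiased GM(1,1) model.

    Assumption: `series` holds positive access counts at equal time intervals.
    """
    n = len(series)
    if n < 4:
        raise ValueError("GM(1,1) needs at least 4 observations")

    # Accumulated generating sequence x1(k) = x0(1) + ... + x0(k).
    x1, total = [], 0.0
    for value in series:
        total += value
        x1.append(total)

    # Background values z(k) = 0.5 * (x1(k) + x1(k-1)) for k = 2..n.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = list(series[1:])

    # Least-squares fit of x0(k) = -a * z(k) + u (simple linear regression).
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(p * q for p, q in zip(z, y))
    denom = m * szz - sz * sz
    if denom == 0:
        raise ValueError("degenerate series: cannot fit the model")
    slope = (m * szy - sz * sy) / denom
    a = -slope
    u = (sy - slope * sz) / m
    if abs(a) >= 2:
        raise ValueError("development coefficient out of range")

    # Unbiased parameters: x_hat(k+1) = A * exp(b * k), with k counted from 1.
    b = math.log((2 - a) / (2 + a))
    big_a = 2 * u / (2 + a)
    return [big_a * math.exp(b * k) for k in range(n, n + steps)]


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs(x - p) / abs(x) for x, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)
```

For a purely exponential access history such as `[2, 4, 8, 16]`, `ugm11_forecast` returns a value very close to 32 for the next moment, which is the property that distinguishes the unbiased model from the basic GM(1,1). In the method described above, such a forecast would then be corrected by the Markov model before being used to decide the number of replicas.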
