Abstract

HDFS is widely used in big data applications, but it was originally designed for homogeneous environments. It adopts a static replica management strategy: once the storage location and number of replicas of a file are determined, they do not change. This strategy lowers overall system performance. In this paper, we propose an optimized replica management strategy, abbreviated as ORMP, to address this problem. ORMP is based on a file heat value and LSTM. The file heat value measures how actively a file is accessed, and LSTM is used to predict the number of accesses to each file. Using the LSTM predictions, the file heat value is updated regularly, so the storage location and number of replicas can be changed dynamically. Experiments show that ORMP reads 22.08% faster than the default replica management strategy.
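
The abstract does not give the heat formula or the rule that maps heat to a replica count, so the following is only a minimal sketch of the general idea under stated assumptions. The function names, the exponential blending formula, the thresholds, and the maximum replica count are all hypothetical and are not taken from the paper. The predicted access count stands in for the output of the LSTM, which is not modeled here.

```python
# Hypothetical sketch: heat-driven replica count. Not the paper's formulas.

DEFAULT_REPLICAS = 3  # HDFS default replication factor


def update_heat(prev_heat: float, predicted_accesses: float,
                decay: float = 0.5) -> float:
    """Blend the previous heat with the predicted access count.

    Assumed formula: an exponentially weighted average, so older
    activity fades while the forecast for the next period is mixed in.
    """
    return decay * prev_heat + (1.0 - decay) * predicted_accesses


def target_replicas(heat: float, hot: float = 100.0, cold: float = 10.0,
                    max_replicas: int = 6) -> int:
    """Map a heat value to a replica count using assumed thresholds."""
    if heat >= hot:
        return max_replicas               # hot file: add replicas
    if heat <= cold:
        return DEFAULT_REPLICAS - 1       # cold file: drop one replica
    return DEFAULT_REPLICAS               # otherwise keep the default


if __name__ == "__main__":
    heat = 0.0
    # Placeholder forecasts; in ORMP these would come from the LSTM.
    for predicted in (200.0, 150.0, 4.0, 2.0, 1.0, 0.0):
        heat = update_heat(heat, predicted)
        print(f"heat={heat:.2f} replicas={target_replicas(heat)}")
```

On a real cluster, a replica count chosen this way could be applied with the standard `hdfs dfs -setrep` command. Choosing where the replicas are stored is a separate decision that this sketch does not cover.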
