Abstract
In recent years, advances in artificial intelligence, mobile smart terminals, and high-bandwidth wireless internet have caused the volume of data generated by various sources to grow continuously. This data holds great economic value, so storing and processing it efficiently has become very important. HDFS (Hadoop Distributed File System) has emerged as a typical representative of data-intensive distributed file systems; it offers high fault tolerance and high throughput, and it can be deployed on low-cost hardware. However, the RPC (Remote Procedure Call) mechanism in HDFS still leaves room for improvement in terms of network traffic and response to abnormal conditions. This paper presents an optimization method to improve the performance of HDFS. The proposed method dynamically adjusts the RPC configuration between the NameNode and the DataNodes by sensing the characteristics of the data stored in the DataNodes. The method effectively reduces the processing load on the NameNode and the network traffic generated by information exchange between the NameNode and the DataNodes. It also reduces the time the whole system takes to respond to abnormal conditions. Finally, extensive experiments demonstrate the effectiveness and efficiency of the proposed method.