Abstract

In recent years, the growth of the Internet has driven a rapid increase in global data volume, and the arrival of the big data era poses great challenges to traditional computing. Big data systems such as Hadoop and Spark have become important platforms for handling this data. However, because of design flaws in big data applications themselves and unreasonable configuration of the distributed framework, applications running on these systems rarely approach the theoretical peak speed of the computer. How to locate the performance bottlenecks of a big data system and analyze their causes is therefore worth studying. This paper proposes a 5-layer performance evaluation model for big data systems, which provides a reliable basis for performance analysis, together with a performance optimization model that assists in locating and analyzing bottlenecks and in further optimizing performance. Based on these two models, an event-based tool for profiling performance data is implemented. Experimental results show that the two models are effective for evaluating and optimizing the performance of big data systems, improving average running time by 19%.

