Abstract

Typical big data platforms use the MapReduce programming model, and tuning platform parameters is the foundation for optimizing MapReduce-based large-scale data analysis and processing algorithms. On the Hadoop cloud platform, parameter optimization can improve the processing performance of the system. This paper presents a parameter optimization method: VMware virtual machine technology is used to configure a single physical node as multiple virtual machines, a complete Hadoop distributed platform is deployed on them as the experimental environment, and cluster tests are executed. The relevant parameters in the Hadoop platform configuration are optimized, TeraSort is run before and after the optimization, and the test results are compared and analyzed. The experiments show that optimizing these parameters has a great influence on the performance of the Hadoop platform. With this method, factors such as hardware configuration, cluster size, and data size can be fully considered for the target application environment before a project is deployed at full scale, and sample experiments can be tuned to obtain the optimal parameter combination for the cloud platform.
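
The before-and-after comparison described above can be pictured as a small search over candidate configurations. The following is only a minimal sketch under stated assumptions, not the procedure used in the paper: the abstract does not name the tuned parameters, so the property names in the usage note (`mapreduce.task.io.sort.mb`, `mapreduce.job.reduces`) are illustrative choices, and `run_benchmark` is a hypothetical callable standing in for whatever launches TeraSort on the test cluster and returns the elapsed time in seconds.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

Config = Dict[str, int]


def parameter_grid(space: Dict[str, Sequence[int]]) -> List[Config]:
    """Enumerate every combination of the candidate parameter values."""
    names = sorted(space)
    return [dict(zip(names, values))
            for values in product(*(space[name] for name in names))]


def to_hadoop_args(config: Config) -> List[str]:
    """Render a configuration as Hadoop generic options (-D property=value)."""
    args: List[str] = []
    for name, value in sorted(config.items()):
        args += ["-D", f"{name}={value}"]
    return args


def tune(space: Dict[str, Sequence[int]],
         run_benchmark: Callable[[Config], float]) -> Tuple[Config, float]:
    """Run the benchmark once per configuration and return the fastest one.

    run_benchmark is a hypothetical hook: in a real experiment it would
    submit the benchmark job with to_hadoop_args(config) and time it.
    """
    timings = [(run_benchmark(config), config)
               for config in parameter_grid(space)]
    best_time, best_config = min(timings, key=lambda item: item[0])
    return best_config, best_time
```

As a usage illustration, calling `tune({"mapreduce.task.io.sort.mb": [100, 200], "mapreduce.job.reduces": [2, 4]}, run_benchmark)` would time four configurations and return the fastest. An exhaustive grid is practical only for small sample experiments such as those on a virtual-machine cluster; each run should use the same input data size so that the timings are comparable.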
