Abstract

This study uses VirtualBox virtualization to build a personal, small-scale Big Data platform, the VM Hadoop, which replicates a Hadoop system inside a virtual machine and gives developers an environment in which to design and implement Hadoop Map/Reduce programs easily. The study also benchmarks three configurations: the VM Hadoop, a small-cluster Hadoop, and Braavos, NCHC's large-scale Hadoop cluster. The benchmark results show that the VM Hadoop is well suited to Map/Reduce code development and testing, while the Braavos Hadoop cluster is the most appropriate for production runs. On the standard WordCount example, the computing time of the Braavos Hadoop cluster is 232 times faster than that of the small-cluster Hadoop. In addition, an engineering example, image recognition for flow monitoring, illustrates how big image data can be analyzed in the Hadoop system. Finally, the VM Hadoop, as a Big Data development platform, is ready for users to download. The first author would like to demonstrate the proposed VM Hadoop system together with an engineering application.
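
For readers unfamiliar with the WordCount benchmark mentioned above, the following is a minimal sketch of its Map/Reduce logic in plain Python. It is not the code used in the study and does not call any Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and the whole job is simulated in a single process only to show the map, shuffle, and reduce steps.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper (hypothetical name): emit a (word, 1) pair per whitespace-separated token."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reducer (hypothetical name): sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run the three phases in-process; a real cluster distributes them across nodes."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data big", "Data platform"]))
```

On a real Hadoop system the mapper and reducer would run as separate tasks over HDFS input splits; this sketch only shows the data flow that such a job follows.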
