Abstract

This study uses VirtualBox virtualization to build a compact, personal multi-node big data VM platform with Spark and Hadoop clusters. The platform replicates a cluster environment in which developers can easily design and implement Spark and Hadoop Map/Reduce programs. On the multi-node Hadoop VM system, developers can write and run Map/Reduce programs exactly as they would on a real multi-node Hadoop cluster. To demonstrate its capability and applicability, this study benchmarks the big data VM platform against a physical multi-node Hadoop cluster. On the standard WordCount benchmark, the physical multi-node Hadoop cluster completes the computation 3.7 times faster than the VM Hadoop cluster. These results indicate that the big data VM platform is well suited to portal development, Map/Reduce and Spark programming, and testing, while the physical Hadoop cluster is most appropriate for production runs. In addition, the big data VM platform contains a web portal development module designed to support applications that deliver big data computing services to engineering and science users. Such applications are inherently complex, potentially accessing data from a variety of sources and distributing applications to a variety of clients. The portal development module can serve many kinds of projects, including personal, small business, enterprise, educational, and infrastructure portals. Finally, the big data VM platform, as a big data development platform, is ready for users to download. The first author of this paper would like to give a demonstration of the proposed multi-node big data VM platform.
