Abstract

The size of the data sets collected and analyzed in data science is growing rapidly, making traditional big data processing solutions prohibitively expensive. When data sets are too large for a single machine, distributed techniques become unavoidable even for simple, embarrassingly parallel jobs. However, distributed computing remains inaccessible to a large number of users. For example, many average users still struggle with complex cluster management and configuration tools [24] even for a task as simple as summing a group of numbers in a large data file. In this paper, we present BDViewer, a web-based tool for processing and visualizing big data. BDViewer uses JavaScript plugins to let users view, process, and visualize large data files through nothing more than a web browser. With a single click, users can open a large data file online and view its contents immediately, no matter how large the file is. On the back end, BDViewer is built on a virtual private cloud system: users' operations in the web browser are converted into map-reduce jobs and MPI tasks that are executed on the cloud. We conclude with experiments that demonstrate BDViewer's effectiveness and ease of use.
