Abstract

Big data are visually cluttered by overlapping data points. Rather than removing, reducing or reformulating overlap, we propose a simple, effective and powerful technique for density cluster generation and visualization, in which the overlap of point markers (the graphical symbols of data points) is exploited additively to obtain bitmap data summaries. Clusters in these summaries can be identified visually, aided by automatically generated contour lines. In the proposed method, the plotting area is a bitmap and the marker is a shape of more than one pixel. As the markers overlap, the red, green and blue (RGB) colour values of pixels in the shared region are added. Thus, a pixel of a 24-bit RGB bitmap can code up to 2^24 (over 16 million) overlaps. A high number of overlaps at the same location gives that area a uniform colour, which can be identified by the naked eye. A bitmap is a matrix of colour values that can be represented as integers. The proposed method updates this matrix as new points are added, so the matrix can be regarded as an up-to-date knowledge unit of the data processed so far. Results show the cluster generation, cluster identification, missing and out-of-range data visualization, and outlier detection capabilities of the proposed method.
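The additive overlap described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`make_bitmap`, `add_point`, `to_rgb`), the square marker shape, and the choice to add a colour value of 1 per marker are all hypothetical. Each pixel is stored as one integer, which is then split into its red, green and blue bytes.

```python
# A sketch only: names, marker shape and the increment of 1 are assumptions.
MAX_COUNT = 2**24 - 1  # largest integer a 24-bit RGB pixel can hold


def make_bitmap(width, height):
    """Return a height x width matrix of integer pixel values, all zero."""
    return [[0] * width for _ in range(height)]


def add_point(bitmap, x, y, half=1):
    """Stamp a square marker of side 2*half+1 centred on (x, y).

    Every pixel covered by the marker is incremented by 1, so pixels
    shared by several markers accumulate the number of overlaps.
    """
    height, width = len(bitmap), len(bitmap[0])
    for row in range(max(0, y - half), min(height, y + half + 1)):
        for col in range(max(0, x - half), min(width, x + half + 1)):
            # Clamp so the value never exceeds what 24 bits can store.
            bitmap[row][col] = min(bitmap[row][col] + 1, MAX_COUNT)


def to_rgb(value):
    """Split a 24-bit integer into its (red, green, blue) bytes."""
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF


bitmap = make_bitmap(8, 8)
for x, y in [(3, 3), (4, 3), (4, 4)]:
    add_point(bitmap, x, y)  # the matrix is updated as each point arrives
```

Because the matrix is updated in place for every new point, it can be read at any moment as a summary of the points seen so far; no separate clustering pass is needed to see where markers pile up.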

Highlights

  • Plotted data are visually cluttered by overlapping data points

  • Rather than removing, reducing or reformulating overlap, we propose a simple, effective and powerful technique for density cluster generation and visualization, where point marker overlap is exploited in an additive fashion in order to obtain bitmap data summaries in which clusters can be identified visually, aided by automatically generated contour lines

  • The graphical knowledge unit (GKU) can be seen as a combination of the quadrat sampling method with contour lines
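To illustrate how contour lines can be drawn over a grid of overlap counts, the sketch below marks the cells that sit on the boundary of a chosen density level. It is an assumption-laden example rather than the paper's contour algorithm: the function name `contour_mask`, the 4-neighbour rule and the sample grid are hypothetical.

```python
def contour_mask(counts, level):
    """Mark cells at or above `level` that touch a 4-neighbour below it.

    Cells on the grid edge count as touching the outside, so a dense
    region that runs off the grid is still outlined.
    """
    height, width = len(counts), len(counts[0])
    mask = [[False] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if counts[r][c] < level:
                continue  # only cells inside the level set can be contour cells
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                outside = not (0 <= nr < height and 0 <= nc < width)
                if outside or counts[nr][nc] < level:
                    mask[r][c] = True
                    break
    return mask


# Hypothetical grid of overlap counts with one dense cell in the middle.
counts = [
    [0, 0, 0, 0, 0],
    [0, 2, 2, 2, 0],
    [0, 2, 5, 2, 0],
    [0, 2, 2, 2, 0],
    [0, 0, 0, 0, 0],
]
ring = contour_mask(counts, level=2)
```

Running the same function at several levels gives nested outlines, which is one simple way to read density off a count grid.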

Introduction

Plotted data are visually cluttered by overlapping data points. Reducing, avoiding and reformulating (as a cluster) such overlap are the three major techniques recommended for clutter reduction in the data visualization field [1,2,3,4,5]. The method we introduce in this paper instead uses overlaps to generate density clusters, without reducing, avoiding or reformulating them. It needs more overlaps for better cluster formation and better visualization, which contrasts with general practice. The proposed method can be considered an anytime cluster formation technique (without a separate cluster identification algorithm), which provides faster cluster generation than online methods [8].
