Abstract

The Parallel Coordinates Plot (PCP) is a popular technique for exploring high-dimensional data, and researchers often apply it to analyze and mine data. However, as data volumes grow, visual clutter and loss of data clarity have become two of the main challenges for parallel coordinates plots. Although the Arc Coordinates Plot (ACP) is a popular approach to these challenges, few optimizations and improvements have been made to it. In this paper, we make three main contributions to state-of-the-art PCP methods. The first improves the visual method itself. The other two improve perceptual scalability when the scale or the dimensionality of the data becomes large, as in some practical mobile and wireless applications. 1) We present an improved visualization method based on ACP, termed the double arc coordinates plot (DACP). It not only reduces the visual clutter in ACP but also uses a dimension-based bundling method, with further optimization, to address the issues of the conventional PCP. 2) To reduce the clutter caused by the order of the axes and to reveal patterns hidden in the data sets, we propose our first dimension reordering method, a contribution-based method in DACP that builds on the singular value decomposition (SVD) algorithm. The approach computes an importance score for each attribute (dimension) of the data using SVD and visualizes the dimensions from left to right in DACP according to that score. 3) We also propose a similarity-based reordering method that combines a nonlinear correlation coefficient with the SVD algorithm. It relies on nonlinear correlation information measurements to measure the correlation between two dimensions and to explain how they interact. We mainly use mutual information to calculate the partial similarity of dimensions in high-dimensional data visualization, and we use SVD to measure the data globally. Lastly, we use five case scenarios to evaluate the effectiveness of DACP. The results show that our approaches not only visualize multivariate datasets well but also effectively alleviate the visual clutter of the conventional PCP, which gives users a better visual experience.
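The abstract does not spell out how mutual information is estimated or how the resulting similarities are turned into an axis order. The following is a minimal sketch under stated assumptions, not the paper's procedure: mutual information is estimated from a fixed-width 2D histogram, and axes are chained greedily so that each axis is followed by its most similar unplaced neighbor. The function names, the `bins` parameter, and the `start` parameter are hypothetical.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram estimate (in nats) of the mutual information of x and y."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()            # joint probabilities per cell
    px = pxy.sum(axis=1, keepdims=True)  # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)  # marginal of y, shape (1, bins)
    nz = pxy > 0                         # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def similarity_order(data, start=0, bins=8):
    """Greedy axis order (assumption): each axis is followed by the
    unplaced axis that shares the most mutual information with it."""
    X = np.asarray(data, dtype=float)
    d = X.shape[1]
    sim = np.zeros((d, d))
    for i in range(d):
        for j in range(i + 1, d):
            sim[i, j] = sim[j, i] = mutual_information(X[:, i], X[:, j], bins)
    order = [start]
    remaining = set(range(d)) - {start}
    while remaining:
        last = order[-1]
        nxt = max(remaining, key=lambda j: sim[last, j])
        order.append(nxt)
        remaining.remove(nxt)
    return order, sim
```

Unlike a linear correlation coefficient, this estimate also responds to nonlinear dependence such as a quadratic relationship. The result depends on the bin count, so `bins` should be treated as a tuning choice rather than a fixed value.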

Highlights

  • Parallel Coordinates Plot (PCP) is a simple but powerful geometric method for visualizing high-dimensional data [1,2,3], which represents N-dimensional data in a 2-dimensional space with mathematical rigor

  • We present an improved visualization method based on arc-based parallel coordinates plots (ACP), termed the double arc coordinates plot (DACP)

  • DACP reduces the visual clutter in ACP and uses a dimension-based bundling method, with further optimization, to address the issues of the conventional parallel coordinates plot (PCP)

  • To reduce the clutter caused by the order of the axes and to reveal patterns hidden in the data sets, we propose our first dimension reordering method, a contribution-based method in DACP that builds on the singular value decomposition (SVD) algorithm


Summary

Introduction

Parallel Coordinates Plot (PCP) is a simple but powerful geometric method for visualizing high-dimensional data [1,2,3], which represents N-dimensional data in a 2-dimensional space with mathematical rigor. In our approach, the axes are reorganized and visualized as double arc parallel coordinates from left to right according to their contribution rates, which are computed for each dimension. This helps to find an optimal order of axes in a short time. This paper is organized as follows: we first review previous work on existing enhancements to PCP and research on dimension reordering in high-dimensional data visualization (Section 2). We then describe the double arc coordinate method theoretically in the novel coordinate system, along with the dimension-based bundling layout used in our approach (Section 3).
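This summary does not give the formula for the contribution rates. As a rough illustration only, the sketch below assumes one plausible definition: standardize the columns, take the SVD, and score each dimension by its squared loadings on the leading right singular vectors, weighted by each component's share of the retained squared singular values. The function name `contribution_order` and the parameter `n_components` are hypothetical and do not come from the paper.

```python
import numpy as np

def contribution_order(data, n_components=1):
    """Rank columns by an assumed SVD-based contribution score.

    Returns (order, scores): column indices from highest to lowest score,
    and the scores themselves, which sum to 1.
    """
    X = np.asarray(data, dtype=float)
    # Standardize each column so that units do not dominate the result.
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant columns stay at zero after centering
    Z = (X - X.mean(axis=0)) / std
    # Thin SVD: Z = U @ diag(s) @ Vt; rows of Vt are right singular vectors.
    _, s, vt = np.linalg.svd(Z, full_matrices=False)
    k = min(n_components, len(s))
    # Weight each kept component by its share of the kept squared singular values.
    weights = s[:k] ** 2 / np.sum(s[:k] ** 2)
    # Assumed score: weighted squared loadings of each dimension.
    scores = weights @ (vt[:k] ** 2)
    # Highest score first, which corresponds to the leftmost axis.
    order = np.argsort(-scores, kind="stable")
    return order, scores
```

With this definition, keeping every component of standardized data would give all dimensions the same score, which is why the sketch keeps only the leading components. The axes would then be drawn left to right in the returned `order`.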

Rationale of PCP
Improvements on PCP
Line-based approaches
Axes-based approaches
External approaches
Double arc coordinate system
Bundling layout
Further optimization for bundling layout
Axes re-ordering methods
Contribution-based re-ordering
Similarity-based re-ordering
Application
The comparison between DACP and ACP
The dimension-based layout in DACP
Iris dataset
Occupancy detection dataset
Contribution-based reordering visualization
Similarity-based reordering visualization
Findings
Conclusion and future work