Abstract

There has been extensive research on dimensionality reduction techniques. While these make it possible to visually present high-dimensional data in 2D or 3D, it remains a challenge for users to make sense of the projected data. Recently, interactive techniques such as Feature Transformation have been introduced to address this. This paper describes a user study designed to understand how feature transformation techniques affect users' understanding of multi-dimensional data visualisation. Feature transformation was compared with traditional dimension reduction techniques, both unsupervised (PCA) and supervised (MCML). Thirty-one participants were recruited to detect visual clusters and outliers using visualisations produced by these techniques. Six datasets covering a range of dimensionality and data size were used in the experiment. Five of these are benchmark datasets, which makes it possible to compare the results with other studies using the same datasets. Both task accuracy and completion time were recorded for comparison. The results make a strong case for the feature transformation technique: participants performed best with the visualisations produced with high-level feature transformation, in terms of both accuracy and completion time. The improvements over the other techniques are substantial, particularly for the accuracy of the clustering task. However, visualising data with very high dimensionality (i.e., greater than 100 dimensions) remains a challenge.

Highlights

  • With the explosive growth in the size of available data (Big Data), there is an increasing demand to help users better understand the Big Data they have

  • Follow-up paired t-tests with Holm correction revealed that FT-high was significantly more accurate than FT-low (p < 10^-13), and both FT-low (p < 10^-8) and Principal Component Analysis (PCA) (p < 0.02) were significantly more accurate than Maximally Collapsing Metric Learning (MCML)

  • This paper described a user study that was designed to understand how feature transformation technique affects the user’s understanding of multi-dimensional data visualisation
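The Holm correction mentioned in the highlights can be illustrated with a minimal sketch. It assumes a list of raw p-values from a family of pairwise tests. The function name `holm_adjust` and the example p-values are hypothetical and are not taken from the study, which reports only upper bounds on its p-values.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for three pairwise comparisons (illustrative only).
raw = [0.01, 0.04, 0.03]
adjusted = holm_adjust(raw)
significant = [p < 0.05 for p in adjusted]
```

A comparison is declared significant when its adjusted p-value falls below the chosen threshold (0.05 in this sketch). Holm's procedure controls the family-wise error rate while being less conservative than a plain Bonferroni correction.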


Summary

Introduction

With the explosive growth in the size of available data (Big Data), there is an increasing demand to help users better understand the data they have. A large portion of Big Data is high dimensional and is notoriously difficult for humans to comprehend, because data with more than three dimensions has no physical analogy. Various dimension reduction techniques have been developed to reduce the number of data dimensions so that the data can be displayed visually [1,2]. Dimensionality Reduction (DR) techniques such as Principal Component Analysis (PCA) and Multidimensional Scaling (MDS) allow analysts to project multidimensional data onto a lower-dimensional (2D or 3D) visual display as scatterplot diagrams, in which patterns such as groups and outliers can be identified. The approach is widely used for exploratory analysis of large information spaces.

Multimodal Technologies and Interact. 2017, 1, 13; doi:10.3390/mti1030013 www.mdpi.com/journal/mti
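The projection step described in the introduction can be sketched in a few lines. This is a minimal illustration of PCA via singular value decomposition, assuming NumPy is available. The helper name `pca_project` and the synthetic two-cluster data are assumptions made for illustration and are not the datasets or the implementation used in the study.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components."""
    X = np.asarray(X, dtype=float)
    # Centre each feature so the components describe variance around the mean.
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


# Synthetic example (assumption): two well-separated clusters in 10 dimensions.
rng = np.random.default_rng(0)
cluster_a = rng.normal(loc=0.0, scale=1.0, size=(50, 10))
cluster_b = rng.normal(loc=6.0, scale=1.0, size=(50, 10))
X = np.vstack([cluster_a, cluster_b])

# Y holds one 2D point per input row, ready to draw as a scatterplot.
Y = pca_project(X)
```

Plotting the two columns of `Y` against each other gives the kind of scatterplot in which groups and outliers are then identified by eye. PCA is unsupervised, so the cluster labels play no part in the projection.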
