Interactive visual data exploration with subjective feedback: an information-theoretic approach

Kai Puolamäki, Jefrey Lijffijt, Bo Kang, Emilia Oikarinen, Tijl De Bie

doi:10.1007/s10618-019-00655-x

Open Access

https://doi.org/10.1007/s10618-019-00655-x

Abstract

Visual exploration of high-dimensional real-valued datasets is a fundamental task in exploratory data analysis (EDA). Existing projection methods for data visualization use predefined criteria to choose the representation of data. There is a lack of methods that (i) use information on what the user has learned from the data and (ii) show patterns that she does not know yet. We construct a theoretical model where identified patterns can be input as knowledge to the system. The knowledge syntax here is intuitive, such as “this set of points forms a cluster”, and requires no knowledge of maths. This background knowledge is used to find a maximum entropy distribution of the data, after which the user is provided with data projections for which the data and the maximum entropy distribution differ the most, hence showing the user aspects of the data that are maximally informative given the background knowledge. We study the computational performance of our model and present use cases on synthetic and real data. We find that the model allows the user to learn information efficiently from various data sources and works sufficiently fast in practice. In addition, we provide an open source EDA demonstrator system implementing our model with tailored interactive visualizations. We conclude that the information-theoretic approach to EDA, where patterns observed by a user are formalized as constraints, provides a principled, intuitive, and efficient basis for constructing an EDA system.
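
The maximum entropy step mentioned in the abstract can be written in a generic textbook form. The notation below (constraint functions f_c, observed values \hat{f}_c, multipliers \lambda_c) is assumed for illustration and is not necessarily the notation used in the paper:

```latex
% Generic maximum entropy problem under expectation constraints
% (illustrative notation, not taken from the paper).
\begin{align}
  p^{*} &= \arg\max_{p} \; -\int p(X)\,\log p(X)\,\mathrm{d}X \\
  \text{subject to}\quad
  \mathbb{E}_{p}\!\left[f_c(X)\right] &= \hat{f}_c, \qquad c = 1,\dots,C .
\end{align}
% Standard result: the solution is an exponential-family distribution,
\begin{equation}
  p^{*}(X) \;\propto\; \exp\!\Big(\sum_{c=1}^{C} \lambda_c\, f_c(X)\Big),
\end{equation}
% where the multipliers \lambda_c are chosen so that the constraints hold.
```

When every f_c is linear or quadratic in the data (for example, the mean and variance of a marked set of points), the exponent is quadratic in X and p^{*} is Gaussian wherever it is normalizable.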

Highlights

Ever since Tukey’s pioneering work on exploratory data analysis (EDA) (Tukey 1977), the task of effectively exploring data has remained an art as much as a science
We present a novel interactive framework for EDA based on solid theoretical principles and taking into account the updating knowledge of the user
Preprocess, whitening, sample, and pca always take less than 2 s each and they are not reported in the table

Summary

Introduction

Ever since Tukey’s pioneering work on exploratory data analysis (EDA) (Tukey 1977), the task of effectively exploring data has remained an art as much as a science. Modern computational methods for dimensionality reduction, such as Projection Pursuit and manifold learning, allow one to spot complex relations in the data automatically and to present them visually. The intuitive idea is that the computed projection shows the maximal difference between the data and the background distribution (i.e., the belief state of the user). [Figure panels: (a) Background distribution, (c) The data in the projection, (e) Observed pattern.] The new projection displayed is the one that is maximally insightful, considering the updated background distribution. We achieve this through the use of a whitening operation (Kessy et al. 2018), which is explained in detail in Sect. The quest to automate the composition of insightful visualizations is important in its own right, as is illustrated in the remainder of the paper.
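
To make the whitening idea concrete, here is a minimal Python sketch under simplifying assumptions. It assumes a single Gaussian background distribution shared by all data points, whereas the paper's background distribution may be richer. The function names `whiten` and `most_informative_directions`, and the score used to rank directions, are hypothetical choices for illustration and not the authors' implementation. Whitening maps the background distribution to a unit spherical Gaussian; directions in which the whitened data has variance far from 1 are then the ones where the data and the background differ most.

```python
import numpy as np


def whiten(data, bg_mean, bg_cov):
    """Map data so that the assumed background N(bg_mean, bg_cov) becomes N(0, I)."""
    evals, evecs = np.linalg.eigh(bg_cov)
    # Symmetric inverse square root of the background covariance.
    inv_sqrt = evecs @ np.diag(1.0 / np.sqrt(evals)) @ evecs.T
    return (data - bg_mean) @ inv_sqrt


def most_informative_directions(whitened, k=2):
    """Return the k principal directions whose variance deviates most from 1.

    The score 0.5 * (var - 1 - log var) is the KL divergence between
    N(0, var) and N(0, 1); using it here is an illustrative assumption.
    Mean shifts are ignored because np.cov centers the data.
    """
    cov = np.cov(whitened, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    scores = 0.5 * (evals - 1.0 - np.log(evals))
    order = np.argsort(scores)[::-1][:k]
    return evecs[:, order], scores[order]


# Synthetic example: the background is a unit Gaussian, but the data is
# stretched along the first axis, which the user does not yet know.
rng = np.random.default_rng(0)
data = rng.normal(size=(2000, 3)) * np.array([3.0, 1.0, 1.0])
whitened = whiten(data, bg_mean=np.zeros(3), bg_cov=np.eye(3))
directions, scores = most_informative_directions(whitened, k=2)
projection = whitened @ directions  # 2-D view that would be shown to the user
```

In this toy setting the top-ranked direction aligns with the stretched axis. Once the user marks that pattern and the background covariance is updated to match, the same axis would score near zero and a different view would be chosen.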

Contributions and outline of the paper

Methods

Preliminaries

Constraints and background distribution

Updating the background distribution

Update rules

About convergence

Whitening operation for finding the most informative visualization

A summary of the proposed interactive framework for EDA

Experiments

Runtime experiment

British National Corpus data

UCI image segmentation data

Proof-of-concept system sideR

Related work

Conclusions

Journal: Data Mining and Knowledge Discovery
Publication Date: Oct 3, 2019
Citations: 4
License type: open-access
