Abstract

The modern economy increasingly relies on exploratory data analysis. Much of this is dependent on data scientists, expert statisticians who process data using statistical tools and programming languages. Our goal is to offer some of this analytical power to end-users who have no statistical training through simple interaction techniques and metaphors. We describe a spreadsheet-based interaction technique that can be used to build and apply sophisticated statistical models such as neural networks, decision trees, support vector machines and linear regression. We present the results of an experiment demonstrating that our prototype can be understood and successfully applied by users having no professional training in statistics or computing, and that the experience of interacting with the system leads them to acquire some understanding of the concepts underlying exploratory statistical modelling.

I. INTRODUCTION

There are many situations in which end-users need to perform simple exploratory and interactive data analysis. Typical examples include the analysis of trends in historical data in order to predict future values, estimation of missing data from a data set, or sanity checking of new data by comparison to known information. In this paper, we present a novel tool that supports these kinds of operations within a spreadsheet-like interaction paradigm.

Formally, the problem being addressed is mixed-initiative statistical inference or machine learning. Here, the goal is to construct a model that characterises a multivariate training data set, and then apply that model to a test data set to estimate missing values or gauge the correctness of preexisting values. Specialised software packages such as scikit-learn (5), programming languages such as R (6), and deep domain expertise are typically needed in order to build and apply such models. In contrast, our tool only requires the user to be familiar with simple spreadsheet manipulation operations.

II. OUR INTERACTION TECHNIQUE
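
As a concrete reading of the problem statement in Section I, the following minimal sketch fits a model to complete training rows and then uses it either to estimate a missing value or to gauge whether an existing value looks plausible. It is an illustration only, not the tool described in this paper: it assumes a single input column, uses ordinary least-squares linear regression as a stand-in for the model families listed in the abstract, and the function names (`fit_line`, `estimate_and_check`), the sample data and the `tolerance` threshold are all hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def estimate_and_check(train_rows, test_rows, tolerance):
    """Fit on the training rows, then estimate or check each test row.

    Rows are (x, y) pairs; in test rows y is None where the value is missing.
    Returns a list of (x, y_estimate, status) tuples.
    """
    slope, intercept = fit_line(
        [x for x, _ in train_rows], [y for _, y in train_rows]
    )
    results = []
    for x, y in test_rows:
        y_hat = slope * x + intercept
        if y is None:
            status = "estimated"   # missing value filled from the model
        elif abs(y - y_hat) <= tolerance:
            status = "plausible"   # existing value agrees with the model
        else:
            status = "suspect"     # existing value deviates from the model
        results.append((x, y_hat, status))
    return results


# Hypothetical data: a period index and an observed figure.
train = [(1, 10.0), (2, 12.0), (3, 14.0), (4, 16.0)]
test = [(5, None), (6, 20.0), (7, 40.0)]
report = estimate_and_check(train, test, tolerance=1.0)
for row in report:
    print(row)
```

In this sketch the first test row has its missing value estimated, while the other two are compared against the model's prediction and labelled accordingly. The fixed `tolerance` is a simplification; a real system would more likely derive a confidence measure from the model itself.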

