The Data Problem in Data Mining

Albrecht Zimmermann

doi:10.1145/2783702.2783706

Abstract

Computer science is essentially an applied or engineering science, creating tools. In Data Mining, those tools are supposed to help humans understand large amounts of data. In this position paper, I argue that for all the progress that has been made in Data Mining, in particular Pattern Mining, we lack insight into three key aspects: 1) how pattern mining algorithms perform quantitatively, 2) how to choose parameter settings, and 3) how to relate found patterns to the processes that generated the data. I illustrate the issue by surveying existing work in light of these concerns and pointing to the (relatively few) papers that have attempted to fill in the gaps. I argue further that progress on those questions is held back by a lack of data with varying, controlled properties, and that this lack is unlikely to be remedied by the ever-increasing collection of real-life data. Instead, I am convinced that we will need to make a science of digital data generation, and use it to develop guidance for data practitioners.

Journal: ACM SIGKDD Explorations Newsletter
Publication Date: May 21, 2015
Citations: 13

Similar Papers

Research Progress on Software Engineering Data Mining Technology
Fengxian Deng
01 Jan 2015

Efficient Analysis of Pattern and Association Rule Mining Approaches
Thabet Slimani ... Amor Lazzez
International Journal of Information Technology and Computer Science | VOL. 6
08 Feb 2014

A Technique to Mine the Multi-Relational Patterns Using Relational Tree and a Tree Pattern Mining Algorithm
...
International Review on Computers and Software | VOL. 8
30 Apr 2013

Big Data Mining and Artificial Intelligence Based Classification Algorithm
Yuan Yuan
01 Jan 2020
