Abstract

Data mining and knowledge discovery in databases (KDD) promise to play an important role in the way people interact with databases, especially scientific databases where analysis and exploration operations are essential. The author defines the basic notions in data mining and KDD, defines the goals, presents motivation, and gives a high-level definition of the KDD process and how it relates to data mining. The author then focuses on data mining methods. Basic coverage of a sampling of methods is provided to illustrate the methods and how they are used. The author covers a case study of a successful application in science data analysis: the classification of cataloging of a major astronomy sky survey covering 2 billion objects in the northern sky. The system can outperform human as well as classical computational analysis tools in astronomy on the task of recognizing faint stars and galaxies. The author also covers the problem of scaling a clustering problem to a large catalog database of billions of objects.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call