Abstract

Data mining aims at extraction of previously unidentified information from large databases. It can be viewed as an automated application of algorithms to discover hidden patterns and to extract knowledge from data. Online Analytical Processing (OLAP) systems, on the other hand, allow exploring and querying huge datasets in interactive way. These OLAP systems are the predominant front-end tools used in data warehousing environments and the OLAP system's market has developed rapidly during the last few years. Several works in the past emphasized the integration of OLAP and data mining. More recently, data mining techniques along with OLAP have been applied in decision support applications to analyze large data sets in an efficient manner. However, in order to integrate data mining results with OLAP the data has to be modeled in a particular type of OLAP schema. An OLAP schema is a collection of database objects, including tables, views, indexes and synonyms. Schema generation process was considered a manual task but in the recent years research communities reported their work in automatic schema generation. In this paper, we reviewed literature on the schema generation techniques and highlighted the limitations of the existing works. The review reveals that automatic schema generation has never been integrated with data mining. Hence, we propose a model for data mining and automatic schema generation of three types namely star, snowflake, and galaxy. Hierarchical clustering technique of data mining was used and schema from the clustered data was generated. We have also developed a prototype of the proposed model and validated it via experiments of real-life data set. The proposed model is significant as it supports both integration and automation process.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call