Multi-relational pattern mining over data streams

Andreia Silva, Cláudia Antunes

doi:10.1007/s10618-014-0394-6

Abstract

The data storage paradigm has changed in the last decade, from operational databases to data repositories that make it easier to analyze data and mine information. Among those, the primary multidimensional model represents data through star schemas, where each relation denotes an event involving a set of dimensions or business perspectives. Mining data modeled as a star schema presents two major challenges: mining extremely large amounts of data and dealing with several data tables at the same time. In this paper, we describe an algorithm, Star FP Stream, in detail. This algorithm aims to find the set of frequent patterns in a large star schema by mining the data directly, in their original structure, and exploiting the most efficient techniques for mining data streams. Experiments were conducted over two star schemas, in the healthcare and sales domains.

Full Text