RARE: Mining colossal closed itemset in high dimensional data

Fatimah Audah Md Zaki, Nurul Fariza Zulkurnain

doi:10.1016/j.knosys.2018.07.025

Open Access

https://doi.org/10.1016/j.knosys.2018.07.025

Abstract

Modern society has become a continuous data generator. Massive automatic data collection has produced a new kind of dataset, termed ‘high-dimensional data’, which is characterized by a relatively small number of rows compared with a large number of columns (or dimensions). Among the many data mining tasks, association rule mining has been used extensively to describe correlations between the variables in a dataset. Mining association rules relies heavily on how efficiently an algorithm can extract all frequent itemsets in the database. The run time and memory consumption of these algorithms are strongly influenced by the search strategy, the pruning strategies, and the method of closure checking. Without these techniques, the choice between depth-first and breadth-first search makes little difference, mainly because the search space is similar. This paper therefore investigates the strategies implemented in both row-enumeration and column-enumeration algorithms and proposes RARE, a breadth-first, bottom-up row-enumeration algorithm for mining colossal closed itemsets in high-dimensional data.
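
To make the idea of row enumeration concrete, the following is a minimal sketch, not the RARE algorithm itself. It relies on a standard property: every closed itemset with non-zero support is the intersection of the rows that contain it. The sketch therefore starts from single rows and intersects them level by level with further rows. The function name, the `min_length` threshold (standing in for "colossal"), and the toy dataset are assumptions made purely for illustration; the pruning and closure-checking strategies of the paper are not reproduced.

```python
from typing import Dict, FrozenSet, Iterable, List, Set


def closed_itemsets_by_row_enumeration(
    rows: Iterable[Iterable[str]],
    min_support: int,
    min_length: int = 1,
) -> Dict[FrozenSet[str], int]:
    """Naive level-wise row enumeration (illustrative only, not RARE).

    Returns a mapping from each closed itemset with at least
    `min_length` items and support >= `min_support` to its support.
    """
    row_sets: List[FrozenSet[str]] = [frozenset(r) for r in rows]

    # Level 1: the itemsets of the individual rows.
    level: Set[FrozenSet[str]] = {
        r for r in row_sets if len(r) >= min_length
    }
    seen: Set[FrozenSet[str]] = set()

    # Each pass intersects the current level with every row, which
    # corresponds to growing the underlying row sets by one row.
    while level:
        seen |= level
        next_level: Set[FrozenSet[str]] = set()
        for itemset in level:
            for r in row_sets:
                common = itemset & r
                # Intersections only shrink, so an itemset that is
                # already too short can be pruned safely.
                if len(common) >= min_length and common not in seen:
                    next_level.add(common)
        level = next_level

    # Every intersection of rows is closed; keep the frequent ones.
    result: Dict[FrozenSet[str], int] = {}
    for itemset in seen:
        support = sum(1 for r in row_sets if itemset <= r)
        if support >= min_support:
            result[itemset] = support
    return result


# Hypothetical toy dataset: 4 rows over 6 items.
toy_rows = [
    {"a", "b", "c", "d"},
    {"a", "b", "c", "e"},
    {"a", "b", "f"},
    {"c", "d"},
]
toy_result = closed_itemsets_by_row_enumeration(
    toy_rows, min_support=2, min_length=2
)
```

On this toy dataset the sketch yields three closed itemsets: {a, b} with support 3, and {a, b, c} and {c, d} with support 2 each. Because the enumeration works over rows rather than items, its cost is driven by the number of rows, which is the motivation for row enumeration when rows are few and columns are many.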

Full Text