MapDiff-FI: Map different sets for frequent itemsets mining

Thaweesak Khongtuk, Chuleerat Jaruskulchai, Salin Boonbrahm

doi:10.1063/1.5055469

Abstract

Mining frequent itemsets is one of the fundamental methods in the growing field of data mining; it describes relationships between items in data sets. The size of the data sets from which frequent itemsets are discovered plays an important role. In recent years, several data structures based on different sets have been proposed and shown to be efficient and scalable for mining frequent itemsets. In this paper, we propose Map Different Sets (MapDiff), a novel and more efficient itemset representation for mining frequent itemsets. To evaluate the performance of MapDiff, we conducted extensive experiments comparing it with the original data sets on a variety of real and synthetic datasets from UCI and IBM. The experimental results show that the MapDiff structure can reduce the size of the datasets while keeping all the information in the original data.
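
For readers unfamiliar with the idea of storing set differences instead of full transaction-id lists, the following minimal Python sketch may help. It assumes that "different sets" refers to the diffsets used in vertical frequent itemset mining, where an itemset keeps only the ids of transactions that contain its prefix but not its newly added item. It does not reproduce the MapDiff structure itself, which the abstract does not specify; the function names and the toy transactions are hypothetical.

```python
from typing import Dict, Hashable, List, Set


def tidsets(transactions: List[Set[Hashable]]) -> Dict[Hashable, Set[int]]:
    """Map each item to the set of ids of the transactions that contain it."""
    result: Dict[Hashable, Set[int]] = {}
    for tid, items in enumerate(transactions):
        for item in items:
            result.setdefault(item, set()).add(tid)
    return result


def diffset(prefix_tids: Set[int], item_tids: Set[int]) -> Set[int]:
    """Ids of transactions that contain the prefix but not the added item."""
    return prefix_tids - item_tids


def support_from_diffset(prefix_support: int, diff: Set[int]) -> int:
    """Support of the extended itemset, recovered from the difference alone."""
    return prefix_support - len(diff)


# Toy data (hypothetical, for illustration only).
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]

t = tidsets(transactions)
d_ab = diffset(t["a"], t["b"])                     # transactions with "a" but without "b"
sup_ab = support_from_diffset(len(t["a"]), d_ab)   # support of {"a", "b"}
```

In this sketch the difference set for {"a", "b"} holds a single id, while the corresponding intersection of id lists would hold three. This is the kind of size reduction that difference-based representations aim for, with the support still exactly recoverable.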
