Abstract

Mining frequent itemsets is one of the fundamental methods in the fast-growing field of data mining; it describes relationships between items in data sets. The size of the data sets needed to discover frequent itemsets plays an important role. In recent years, several data structures based on difference sets have been proposed and shown to be efficient and scalable for mining frequent itemsets. In this paper, we propose Map Different Sets (MapDiff), a novel and more efficient itemset representation for mining frequent itemsets. To evaluate the performance of MapDiff, we conducted extensive experiments comparing it with the original data sets on a variety of real and synthetic datasets from UCI and IBM. The experimental results show that the MapDiff structure can reduce the size of the datasets while keeping all the information in the original data.

