Abstract

In association rule mining, all frequent itemsets are found first, and the confidence of each association rule is then calculated from the support of those frequent itemsets. Because every non-empty subset of a frequent itemset is itself frequent, it suffices to find all maximal frequent itemsets (MFIs), that is, frequent itemsets none of whose supersets are frequent. This study proposes an algorithm, named right-hand side expanding (RHSE), that finds all MFIs exactly. First, an Expanding Operation is designed which, starting from any given frequent itemset, adds items according to certain rules and forms supersets of that itemset, all of which are MFIs. Next, this operation is applied with each frequent 1-itemset in turn as the starting point, so that all MFIs are found in the end. Owing to the special design of the Expanding Operation, every MFI is found, and the path by which it is found is unique, which avoids redundancy in both time and space. Because it avoids redundant computation while still finding all MFIs, the algorithm runs fast and is applicable to big data with high-dimensional attributes and a massive number of transactions. Finally, a detailed experimental report on 10 open standard transaction sets is given, including results for big data with transactions numbering in the millions.
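To make the definitions in the abstract concrete, the following minimal sketch enumerates frequent itemsets by brute force on a toy transaction set, keeps the maximal ones, and computes a rule's confidence from support counts. It is not the RHSE algorithm or its Expanding Operation; the function names (`frequent_itemsets`, `maximal_itemsets`, `confidence`), the toy data, and the threshold are assumptions made purely for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_count):
    """Brute-force enumeration (illustrative only, not RHSE).

    Returns {itemset: support count} for every itemset contained in at
    least `min_count` transactions.
    """
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            count = sum(1 for t in transactions if cand <= t)
            if count >= min_count:
                result[cand] = count
                found_any = True
        if not found_any:
            # Every subset of a frequent itemset is frequent, so if no
            # itemset of this size is frequent, no larger one can be.
            break
    return result


def maximal_itemsets(frequent):
    """Keep only frequent itemsets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


def confidence(frequent, antecedent, consequent):
    """Confidence of A => B, i.e. support(A union B) / support(A)."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Hypothetical toy transaction set.
transactions = [frozenset(t) for t in (
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
)]

frequent = frequent_itemsets(transactions, min_count=3)
mfis = maximal_itemsets(frequent)
rule_conf = confidence(frequent, {"a"}, {"b"})
```

With a minimum support count of 3, the three 2-itemsets are the MFIs here, and every frequent itemset is a subset of one of them, which is why finding the MFIs is enough to recover the set of frequent itemsets.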

Highlights

  • Association rule mining refers to finding implications such as A ⇒ B from the given transaction set, where A and B are itemsets

  • The effective part of the transaction set can be loaded into RAM, which speeds up the algorithm and provides a basis for single-machine computing on high-dimensional big data with a massive number of transactions

  • The Expanding Operation adopts a one-way search strategy, so that every maximal frequent itemset (MFI) can be found and the path by which it is found is unique

  • The MFIs acquired together are called a group of MFIs, and these groups form a larger pool of MFIs. This pool, which is itself a set of MFIs, is an exact solution to the problem

Summary

Introduction

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Exact algorithms are based on cluster computing, which requires multiple computers. Both algorithms have their respective advantages and disadvantages when solving the big data association rule mining problem. The motivation of the algorithm proposed in this paper is to mine MFIs within an acceptable time, without cluster computers, for big data with high-dimensional attributes and a massive number of transactions.

A reduced transaction set is generated and used without changing the mining results. In this way, the effective part of the transaction set can be loaded into RAM, which speeds up the algorithm and provides a basis for single-machine computing on high-dimensional big data with a massive number of transactions. The computational redundancy of the algorithm is avoided as much as possible.
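One way a transaction set can be reduced without changing mining results is sketched below. This summary does not say how the paper builds its reduced transaction set, so the approach here is an assumption: drop items that are not frequent on their own (they cannot belong to any frequent itemset), discard transactions that become empty, and merge identical transactions into one entry with a weight. The names `reduce_transactions` and `support_count` are hypothetical.

```python
from collections import Counter


def reduce_transactions(transactions, min_count):
    """Assumed reduction scheme, not necessarily the paper's.

    Removes infrequent items, drops emptied transactions, and merges
    duplicates into {itemset: number of original transactions}.
    """
    item_counts = Counter(item for t in transactions for item in set(t))
    keep = {item for item, c in item_counts.items() if c >= min_count}
    reduced = Counter()
    for t in transactions:
        trimmed = frozenset(t) & keep
        if trimmed:
            reduced[trimmed] += 1
    return reduced


def support_count(reduced, itemset):
    """Support count of `itemset`, computed on the weighted reduced set."""
    target = frozenset(itemset)
    return sum(weight for t, weight in reduced.items() if target <= t)


# Hypothetical toy data: x, y, z and c each occur only once.
raw = [{"a", "b", "x"}, {"a", "b"}, {"a", "b", "y"}, {"z"}, {"a", "c"}]
reduced = reduce_transactions(raw, min_count=2)
```

In this toy case five transactions collapse to two weighted entries, while the support counts of the itemsets built from the surviving items `a` and `b` are the same as on the raw data.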

Right-Hand Side Expanding Algorithm
Transaction Set Preprocessing
Complexity
Proof of Algorithm Accuracy
Integrity Proof
Uniqueness
Experiment
Brief Description of Transaction Sets
Mining Results
Comparison of Algorithm Running Time with Solution Space Size
Comparison of Algorithm Running Time with Traditional Exact Algorithm