Abstract

Finding frequent itemsets is a popular data mining problem that aims to extract hidden patterns from a transactional database. Several bio-inspired approaches have been proposed to overcome the poor performance of exact algorithms such as Apriori and FPGrowth. Approaches based on genetic algorithms are among the most efficient in terms of runtime, but they remain weak in terms of solution quality, i.e., the number of frequent itemsets discovered. To address this issue, we propose a new genetic algorithm for finding frequent itemsets, called GA-Apriori, in which the crossover and mutation operators are defined by taking into account the Apriori heuristic principle. Our evaluation shows that GA-Apriori outperforms other genetic-algorithm-based approaches to frequent itemset mining, especially on large instances. The experiments also show that GA-Apriori is competitive with exact approaches in terms of the number of frequent itemsets discovered.
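
As a purely illustrative aid, the sketch below shows what "frequent itemset" and the Apriori principle mean on a toy database: an itemset can only be frequent if all of its subsets are frequent. It is not the GA-Apriori crossover or mutation operator, which the abstract does not specify. The function names, the toy transactions, and the 0.5 support threshold are all assumptions made for this example.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def passes_apriori_check(itemset, frequent_smaller):
    """Apriori principle: every subset one item smaller must already be frequent."""
    itemset = frozenset(itemset)
    return all(
        frozenset(subset) in frequent_smaller
        for subset in combinations(itemset, len(itemset) - 1)
    )


# Hypothetical toy database (not taken from the paper).
transactions = [
    frozenset(t)
    for t in (
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    )
]
min_support = 0.5  # assumed threshold for this example

items = sorted(set().union(*transactions))

# Level 1: frequent single items.
frequent_1 = {
    frozenset([item])
    for item in items
    if support([item], transactions) >= min_support
}

# Level 2: count support only for pairs that survive the Apriori check.
frequent_2 = {
    frozenset(pair)
    for pair in combinations(items, 2)
    if passes_apriori_check(pair, frequent_1)
    and support(pair, transactions) >= min_support
}
```

In this toy run, the pairs {bread, milk} and {bread, butter} reach the threshold, while {butter, milk} appears in only one of four transactions and is rejected. A search procedure that applies the same check before evaluating a candidate can avoid spending support counts on itemsets that cannot be frequent.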
