Abstract

Exact approaches to Frequent Itemsets Mining (FIM) suffer from poor runtime performance on large database instances. Several bio-inspired FIM approaches have been proposed to overcome this issue. These are considerably more efficient in terms of runtime, but they still yield poor-quality solutions. The quality of a solution, i.e., the number of frequent itemsets discovered, can be increased by improving the randomised search of the solution space so that it takes intrinsic features of the FIM problem into account. This paper proposes a new framework for bio-inspired FIM approaches that exploits the recursive property of frequent itemsets, i.e., the same feature exploited by the exact Apriori algorithm, when searching the solution space. Based on the proposed framework, we define two new approaches to FIM, namely GA-Apriori and PSO-Apriori, which use genetic algorithms and particle swarm optimisation, respectively. Extensive experiments on synthetic and real database instances show that the proposed approaches outperform other bio-inspired ones in terms of runtime performance. The results also reveal that the performance of PSO-Apriori is comparable to that of the exact approaches Apriori and FPGrowth with respect to the quality of the solutions found. We also show that PSO-Apriori outperforms the recently developed BATFIM algorithm when dealing with very large database instances.
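
The recursive property referred to above is usually stated as downward closure: every subset of a frequent itemset is itself frequent, so size-k candidates need only be built from frequent itemsets of size k-1. The following is a minimal sketch of the classic level-wise Apriori procedure that illustrates this property. It is not the GA-Apriori or PSO-Apriori method of the paper; the function name `apriori`, the helper `count_frequent`, and the use of an absolute count for `min_support` are assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every itemset whose
    count is at least min_support (an absolute count, by assumption)."""
    transactions = [frozenset(t) for t in transactions]

    def count_frequent(candidates):
        # Count how many transactions contain each candidate,
        # then keep only the candidates that meet the threshold.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        return {c: n for c, n in counts.items() if n >= min_support}

    # Level 1: frequent single items.
    singletons = {frozenset([item]) for t in transactions for item in t}
    frequent = count_frequent(singletons)
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: merge frequent (k-1)-itemsets into size-k candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step (downward closure): drop any candidate that has
        # an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        frequent = count_frequent(candidates)
        result.update(frequent)
        k += 1
    return result
```

Calling `apriori([{"a", "b"}, {"a", "c"}, {"a", "b", "c"}], min_support=2)` returns a dictionary keyed by `frozenset` itemsets. The prune step is where the recursive property pays off: it discards candidates without scanning the database, which is the kind of search-space restriction the abstract describes carrying over to the randomised search.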
