Abstract

Mining quantitative association rules is one of the most important tasks in data mining and arises in many real-world problems. Many studies have shown that the particle swarm optimization (PSO) algorithm is suitable for quantitative association rule mining (ARM), and it has been applied successfully in different fields. However, the method becomes inefficient, or even infeasible, on huge datasets. This paper proposes a parallel PSO for quantitative association rule mining (PPQAR). The parallel algorithm provides two methods, particle-oriented and data-oriented parallelization, to fit different application scenarios. Experiments were conducted to evaluate these two methods. Results show that particle-oriented parallelization achieves a higher speedup, while the data-oriented method is more general on large datasets.

Highlights

  • Association rule mining, one of the main tasks of knowledge discovery in databases, was proposed by Agrawal et al. [1] in 1993 to extract frequent itemsets, as well as the hidden correlations among them (i.e., association rules), from the transactions in a database [2]

  • This paper proposes a parallel PSO algorithm for quantitative association rule mining, i.e. PPQAR, which can run on a distributed cluster and achieves a remarkable efficiency improvement over the serial version

  • Some previous works focused on parallelizing classic association rule mining (ARM) algorithms such as Apriori and FP-Growth, based on the Hadoop Map-Reduce model or the Spark RDD framework


Summary

Introduction

Association rule mining, one of the main tasks of knowledge discovery in databases, was proposed by Agrawal et al. [1] in 1993 to extract frequent itemsets, as well as the hidden correlations among them (i.e., association rules), from the transactions in a database [2]. Yu H et al. [11] proposed an improved Apriori algorithm based on the boolean matrix and Hadoop. She X and Zhang L [12] proposed an improved parallel Apriori algorithm based on the Map-Reduce distributed architecture. Before introducing the proposed parallel algorithm, some basic concepts are briefly restated, including the definition of an association rule, the original PSO algorithm, and its use in multi-objective optimization. A transaction t = (tid, X) is said to contain itemset Y if Y ⊆ X.
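
The containment relation Y ⊆ X can be made concrete with a minimal Python sketch. The names `Transaction`, `contains`, and `support` are hypothetical and do not come from the paper. The `support` function uses the standard textbook definition (the fraction of transactions that contain an itemset), which this summary does not state explicitly, so it should be read as an assumption.

```python
from typing import FrozenSet, List, Tuple

# A transaction t = (tid, X): an identifier plus the set of items X.
# This representation is an assumption made for illustration.
Transaction = Tuple[int, FrozenSet[str]]


def contains(transaction: Transaction, itemset: FrozenSet[str]) -> bool:
    """Return True if the transaction contains itemset Y, i.e. Y is a subset of X."""
    _tid, items = transaction
    return itemset <= items  # frozenset subset test


def support(transactions: List[Transaction], itemset: FrozenSet[str]) -> float:
    """Fraction of transactions that contain the itemset (standard definition)."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if contains(t, itemset))
    return hits / len(transactions)


# Toy data, invented purely for this example.
transactions: List[Transaction] = [
    (1, frozenset({"bread", "milk"})),
    (2, frozenset({"bread", "butter"})),
    (3, frozenset({"milk", "bread", "eggs"})),
    (4, frozenset({"eggs"})),
]

print(contains(transactions[0], frozenset({"milk"})))        # True
print(support(transactions, frozenset({"bread", "milk"})))   # 0.5
```

Counting of this kind requires a pass over every transaction for each candidate rule, which is one reason evaluating many rules becomes expensive on very large datasets.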

