Abstract

Association rules have important applications in many fields. However, with the explosive growth of information technology in recent years, the efficiency of association rule mining has become a serious problem. The parallel multi-swarm PSO frequent pattern (PMSPF) algorithm combines the particle swarm optimisation (PSO) algorithm with the frequent pattern-growth (FP-growth) algorithm to greatly improve the efficiency of association rule mining. However, in a Spark cluster computing environment, its computational load is not balanced, so a large amount of data may lead to problems such as memory overflow. In this paper, the parallel conditional frequent pattern (PCFP) tree algorithm is proposed on the basis of PMSPF. First, data grouping solves the problem of the data volume being too large to construct an FP-tree. Then, a parallel strategy for the conditional trees implements parallel computing. The experimental results show that, although the PCFP algorithm generates some data redundancy during data grouping, its efficiency is significantly higher than that of the PMSPF algorithm and the traditional parallel frequent pattern (PFP) algorithm.
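
To illustrate why data grouping lets each worker build a smaller tree at the cost of some redundancy, the following is a minimal Python sketch of group-dependent transaction sharding in the style of the traditional PFP algorithm. It is not the PCFP procedure itself, whose details the abstract does not give. The function name `group_transactions`, its parameters, and the round-robin assignment of items to groups are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def group_transactions(transactions, min_support, num_groups):
    """Split transactions into per-group shards (PFP-style sketch, hypothetical helper)."""
    # Count the support of each item (an item counts once per transaction).
    counts = Counter(item for t in transactions for item in set(t))

    # Keep frequent items, ranked by descending support; ties broken by item name.
    frequent = sorted(
        (item for item, c in counts.items() if c >= min_support),
        key=lambda item: (-counts[item], item),
    )
    rank = {item: r for r, item in enumerate(frequent)}

    # Assumption: items are assigned to groups round-robin by rank.
    group_of = {item: r % num_groups for item, r in rank.items()}

    shards = defaultdict(list)
    for t in transactions:
        # Drop infrequent items and order the rest by global rank.
        ordered = sorted((i for i in set(t) if i in rank), key=rank.get)
        seen = set()
        # Scan from the least frequent item; each group receives the longest
        # prefix ending in one of its items, at most once per transaction.
        for j in range(len(ordered) - 1, -1, -1):
            g = group_of[ordered[j]]
            if g not in seen:
                seen.add(g)
                shards[g].append(ordered[: j + 1])
    return dict(shards), group_of


transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]
shards, group_of = group_transactions(transactions, min_support=2, num_groups=2)
for g in sorted(shards):
    print(g, shards[g])
```

In this toy input, four transactions become six shard entries, because a prefix of the same transaction can be sent to more than one group. This is the kind of redundancy that grouping trades for smaller, independently built trees.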
