Implementation of Association Rule Mining Algorithms on Distributed Data Processing Platforms

Duygu Sesver, Umut Orçun Turgut, Sabah Tuna, Oya Kalıpsız, Alper Nebi Kanlı, Mehmet S Aktaş

doi:10.1109/ubmk.2019.8907040

Abstract

Association rule mining is a frequently used data mining technique that aims to find the items that frequently occur together in a data set. Current large-scale data processing and analysis platforms are not focused on data mining, so they do not offer extensive libraries of association rule mining algorithms. In this research, a library of association rule mining algorithms has been developed for a large-scale data processing platform. Apache Spark was chosen as the platform for the case study because of its widespread use. The Apriori, Eclat and Pascal algorithms are implemented on this platform in a way that benefits from the Map-Reduce programming model. The library produced by the proposed implementation method is comparatively analyzed in terms of performance metrics on big data processing platforms with single and multiple nodes. The implemented methods are also compared with the FP-Growth algorithm that ships with the Spark platform. The results show that, when tested on large-scale data, the Apriori algorithm gives much better performance values than the other algorithms when switching from a single-node cluster environment to a multi-node cluster environment.
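As an illustration only, and not the authors' Spark library, the following plain-Python sketch shows one way Apriori's support counting could be phrased in a map/reduce style: each data partition counts candidate itemsets locally (map), and the partial counts are then merged (reduce). The function names `map_phase`, `reduce_phase` and `apriori`, the in-memory partitions, and the sample transactions are all assumptions made for this example.

```python
from collections import Counter
from functools import reduce
from itertools import combinations


def map_phase(partition, candidates):
    """Map: count, within one partition, how often each candidate itemset occurs."""
    counts = Counter()
    for transaction in partition:
        for candidate in candidates:
            if candidate <= transaction:  # candidate is a subset of the transaction
                counts[candidate] += 1
    return counts


def reduce_phase(left, right):
    """Reduce: merge the partial counts of two partitions."""
    return left + right


def apriori(partitions, min_support):
    """Return {itemset: support count} for itemsets occurring at least min_support times."""
    items = {item for part in partitions for t in part for item in t}
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        partial = [map_phase(part, candidates) for part in partitions]
        totals = reduce(reduce_phase, partial, Counter())
        level = {s: c for s, c in totals.items() if c >= min_support}
        frequent.update(level)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


if __name__ == "__main__":
    # Hypothetical toy data: two partitions standing in for two cluster nodes.
    demo = [
        [{"bread", "milk"}, {"bread", "butter"}],
        [{"bread", "milk", "butter"}, {"milk"}],
    ]
    for itemset, count in sorted(apriori(demo, 2).items(), key=lambda p: sorted(p[0])):
        print(sorted(itemset), count)
```

On a real cluster the per-partition counting would run in parallel on the worker nodes; here the list comprehension over `partitions` only stands in for that step.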
