Abstract

Data mining is the process of analyzing data to extract useful information for users. Association rule mining is a data mining technique used to detect correlations and reveal relationships among individual data items in very large databases. A rule usually takes the form "if X then Y", with X and Y as independent attributes. Association rules have become a popular technique in several vital fields of activity, such as insurance, medicine, banking, and supermarkets. The rules are generated in huge numbers by algorithms known as Association Rules Mining algorithms. Generating such large quantities of association rules can consume considerable time and effort, which creates an urgent need for an efficient and scalable approach that mines only the relevant and significant association rules. This paper proposes an innovative approach that mines the optimal rules from a large set of association rules using distributive processing, in order to improve efficiency and decrease running time.

Highlights

  • Big data is an important research topic and it has attracted considerable attention

  • Association Rules Mining is one of the most common algorithm-based data mining techniques; it extracts interesting relationships and correlations among items in large amounts of data

  • In this paper, we addressed the issue of the distributive Association Rules Mining process


Summary

INTRODUCTION

Big data is an important research topic that has attracted considerable attention. Huge numbers of data sets sit unused and redundant in the databases of companies, universities, etc. Discovering the information stored in these databases relies on an efficient KDD (Knowledge Discovery in Databases) process. This process retrieves data or lets researchers find new information from data [1], and it can reveal patterns and relationships among large amounts of data in a single data set or in several. Algorithms have been proposed to help generate association rules, either by improving the process of pattern extraction or by introducing other criteria and factors to determine which rules to keep and which to discard [3]. These algorithms are mainly designed for centralized computing systems and evaluated on relatively small databases.
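
To make the idea of keep-or-discard criteria concrete, here is a minimal Python sketch. It assumes support and confidence as the rule measures, which are common in the association rules literature but are not named in this summary, and it counts itemsets separately in each data partition before merging the counts. The function names and the toy transactions are hypothetical, and the sketch is not the paper's MDPREF algorithm.

```python
from collections import Counter
from itertools import combinations


def count_itemsets(partition, max_size=2):
    """Count, for one partition, how many transactions contain each itemset."""
    counts = Counter()
    for transaction in partition:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[frozenset(itemset)] += 1
    return counts


def merge_counts(partial_counts):
    """Sum the per-partition counts into global counts."""
    total = Counter()
    for partial in partial_counts:
        total.update(partial)  # Counter.update adds counts together
    return total


def rule_metrics(counts, n_transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    x = frozenset(antecedent)
    xy = x | frozenset(consequent)
    support = counts[xy] / n_transactions
    confidence = counts[xy] / counts[x] if counts[x] else 0.0
    return support, confidence


# Hypothetical toy data: two partitions of a small transaction database.
partitions = [
    [["bread", "milk"], ["bread", "butter", "milk"]],
    [["milk"], ["bread", "butter"]],
]
counts = merge_counts(count_itemsets(p) for p in partitions)
n_transactions = sum(len(p) for p in partitions)
support, confidence = rule_metrics(counts, n_transactions, ["bread"], ["milk"])
print(f"bread -> milk: support={support:.2f}, confidence={confidence:.2f}")
```

Because each partition is counted independently, the per-partition step could run on separate workers, with only the small count tables being merged afterwards. A rule would then be kept or discarded by comparing its measures against chosen thresholds.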

BACKGROUND
Distributed machine learning and data mining techniques
Data partitioning
Distributive Association Rules mining
MDPREF Algorithm
Experimental setup
CONCLUSION