An Ontology-driven MapReduce Framework for Association Rules Mining in Massive Data

Rania Mkhinini Gahar, Olfa Arfaoui, Minyar Sassi Hidri, Nejib Ben Hadj-Alouane

doi:10.1016/j.procs.2018.07.236

Open Access

https://doi.org/10.1016/j.procs.2018.07.236

Journal: Procedia Computer Science	Publication Date: Jan 1, 2018
Citations: 3	License type: cc-by-nc-nd

Affiliation: National Engineering School of Tunis

Abstract

To remain competitive, companies need to exploit the huge volumes of data now available, often called the Big Data deluge, to predict what might happen in the future. Predictive analytics therefore plays an important role in extracting useful information that can inform business strategy and yield competitive advantages. It relies on data mining algorithms to discover knowledge from huge volumes of data. In this context, Association Rules (ARs) mining is one of the most widespread data mining techniques, and it is based primarily on frequent itemset mining. However, when applied to Big Data, ARs mining algorithms produce a huge number of ARs, many of which are redundant or useless. To overcome this drawback, we propose an ontology-driven MapReduce framework for ARs mining in massive data. Ontologies make it possible to filter the generated ARs and keep only the useful ones. The filtering is performed by a semantic pruning phase introduced into the MapReduce jobs, which eliminates useless candidates from the computation of the Maximal Frequent Itemsets (MFI). This may yield a quantitative and, above all, qualitative reduction in the number of MFI and, subsequently, of ARs. Extensive experiments on several datasets demonstrate the framework's ability to handle massive data for mining ARs.
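The idea of a semantic pruning phase inside the map step can be illustrated with a small, single-process sketch. It is not the authors' implementation: the toy ontology, the pruning rule (drop a candidate when one item subsumes another), the brute-force candidate enumeration, and all function names are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy ontology (assumption): item -> parent concept.
ONTOLOGY = {
    "espresso": "coffee",
    "latte": "coffee",
    "coffee": "beverage",
    "croissant": "pastry",
    "muffin": "pastry",
}


def ancestors(item):
    """Return every concept above `item` in the toy ontology."""
    found = set()
    while item in ONTOLOGY:
        item = ONTOLOGY[item]
        found.add(item)
    return found


def is_semantically_redundant(itemset):
    """Assumed pruning rule: a candidate is redundant if one item subsumes another."""
    return any(
        a in ancestors(b) or b in ancestors(a)
        for a, b in combinations(itemset, 2)
    )


def map_phase(transaction, max_size=3):
    """Emit (candidate, 1) pairs, skipping candidates rejected by semantic pruning.

    Candidates are enumerated by brute force up to `max_size` items,
    which is only practical for tiny examples.
    """
    items = sorted(set(transaction))
    for size in range(1, min(max_size, len(items)) + 1):
        for candidate in combinations(items, size):
            if not is_semantically_redundant(candidate):
                yield candidate, 1


def reduce_phase(pairs):
    """Sum the counts emitted for each candidate itemset."""
    counts = Counter()
    for candidate, count in pairs:
        counts[candidate] += count
    return counts


def maximal_frequent_itemsets(transactions, min_support):
    """Keep frequent itemsets that have no frequent proper superset."""
    pairs = (pair for t in transactions for pair in map_phase(t))
    counts = reduce_phase(pairs)
    frequent = {frozenset(k) for k, c in counts.items() if c >= min_support}
    return {s for s in frequent if not any(s < other for other in frequent)}


transactions = [
    ["espresso", "coffee", "croissant"],
    ["espresso", "croissant"],
    ["coffee", "croissant"],
    ["latte", "muffin"],
]
mfi = maximal_frequent_itemsets(transactions, min_support=2)
```

In this toy run the candidate {coffee, espresso} is never counted, because "coffee" is an ancestor of "espresso" in the assumed ontology, so the pruning shrinks the candidate space before any support counting takes place. In a real MapReduce deployment the map and reduce functions would run as distributed tasks rather than as plain Python generators.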

Full Text