Abstract

Association rule mining (ARM) is one of the most popular data mining tasks. Although many effective ARM algorithms run on binary or discrete-valued data, they cannot run efficiently on data with numeric-valued attributes. However, in many real-world applications the data consist largely of numerical values. Numerical ARM is difficult for several reasons: one must determine which attributes to include in the discovered rules; automatically adjust the ranges of those attributes in the most appropriate way; and rapidly discover a reduced set of high-quality rules directly, without generating the frequent itemsets, while ensuring that the rules are comprehensible, surprising, interesting, accurate, and confidential. Furthermore, performing all of these steps without requiring metrics to be determined a priori for each data set is of great importance for automating the task. Recently, numerical ARM has been treated as a multi-objective problem in which several criteria are satisfied at the same time. In this study, algorithms that treat numerical ARM as a multi-objective optimization problem were examined and, to the best of our knowledge, their performance was analyzed for the first time. The MOPNAR, QAR-CIP-NSGA II, NICGAR, MODENAR, MOEA-Ghosh, and ARMMGA methods were compared on real-world data sets consisting of numerical attributes in terms of the number of rules, average support, average confidence, average lift, average conviction, average certainty factor, average netconf, average yulesQ, and coverage percentage. The performance of these algorithms was also compared with that of single-objective optimization methods for ARM. MOEA-Ghosh was found to be the most effective multi-objective method in terms of average support and average confidence on data sets containing a large number of records and attributes. On data sets containing relatively few records and attributes, the best average support values were obtained by MOEA-Ghosh and the best average confidence values by QAR-CIP-NSGA II. Furthermore, the multi-objective algorithms outperformed the single-objective algorithms with respect to the average support, lift, certainty factor, netconf, and yulesQ metrics.
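
To make the comparison metrics concrete, the following minimal sketch computes them for a single numeric (interval-based) rule. It assumes the definitions of support, confidence, lift, conviction, certainty factor, netconf, and Yule's Q that are commonly used in the ARM literature; the abstract does not state the exact formulas used in the study, so they are an assumption here. The toy records, the attribute names `age` and `salary`, and the helper names `covers` and `rule_measures` are hypothetical and serve only as illustration.

```python
import math

def covers(record, intervals):
    """True if every attribute in `intervals` lies within its [low, high] range."""
    return all(low <= record[attr] <= high for attr, (low, high) in intervals.items())

def rule_measures(records, antecedent, consequent):
    """Quality measures for the rule antecedent -> consequent (assumed standard definitions)."""
    n = len(records)
    p_a = sum(covers(r, antecedent) for r in records) / n
    p_c = sum(covers(r, consequent) for r in records) / n
    p_ac = sum(covers(r, antecedent) and covers(r, consequent) for r in records) / n

    confidence = p_ac / p_a if p_a > 0 else 0.0
    lift = confidence / p_c if p_c > 0 else 0.0
    conviction = math.inf if confidence == 1 else (1 - p_c) / (1 - confidence)

    # Certainty factor: relative gain (or loss) in belief in the consequent.
    if confidence > p_c:
        certainty = (confidence - p_c) / (1 - p_c)
    elif confidence < p_c:
        certainty = (confidence - p_c) / p_c
    else:
        certainty = 0.0

    netconf = (p_ac - p_a * p_c) / (p_a * (1 - p_a)) if 0 < p_a < 1 else 0.0

    # Yule's Q from the 2x2 contingency table of antecedent vs. consequent.
    p_a_notc = p_a - p_ac
    p_nota_c = p_c - p_ac
    p_nota_notc = 1 - p_a - p_c + p_ac
    num = p_ac * p_nota_notc - p_a_notc * p_nota_c
    den = p_ac * p_nota_notc + p_a_notc * p_nota_c
    yules_q = num / den if den != 0 else 0.0

    return {
        "support": p_ac, "confidence": confidence, "lift": lift,
        "conviction": conviction, "certainty_factor": certainty,
        "netconf": netconf, "yules_q": yules_q,
    }

# Hypothetical toy data: age in years, salary in thousands.
RECORDS = [
    {"age": 25, "salary": 30}, {"age": 32, "salary": 45},
    {"age": 38, "salary": 60}, {"age": 41, "salary": 80},
    {"age": 45, "salary": 50}, {"age": 52, "salary": 65},
    {"age": 29, "salary": 35}, {"age": 60, "salary": 90},
]

# Rule: age in [30, 45] -> salary in [40, 70]
measures = rule_measures(RECORDS, {"age": (30, 45)}, {"salary": (40, 70)})
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

A multi-objective method in this setting would evaluate several such measures for each candidate rule and search for interval bounds that trade them off against one another, rather than fixing a single threshold in advance.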

