Comparative evaluation of pattern mining techniques: an empirical study

Anindita Borah,Bhabesh Nath

doi:10.1007/s40747-020-00226-4

Abstract

Pattern mining has emerged as a compelling field of data mining over the years. Literature has bestowed ample endeavors in this field of research ranging from frequent pattern mining to rare pattern mining. A precise and impartial analysis of the existing pattern mining techniques has therefore become essential to widen the scope of data analysis using the notion of pattern mining. This paper is therefore an attempt to provide a comparative scrutiny of the fundamental algorithms in the field of pattern mining through performance analysis based on several decisive parameters. The paper provides a structural classification of the widely referenced techniques in four pattern mining categories: frequent, maximal frequent, closed frequent and rare. It provides an analytical comparison of these techniques based on computational time and memory consumption using benchmark real and synthetic data sets. The results illustrate that tree based approaches perform exceptionally well over level wise approaches in case of dense data sets for all the categories. However, for sparse data sets, level wise approaches performed better than the former ones. This study has been carried out with an aim to analyze the pros and cons of the well known pattern mining techniques under different categories. Through this empirical study, an endeavor has been made to enable the researchers identify some fruitful and promising research directions in one of the most remarkable area of research, pattern mining.

Full Text