Enhanced Candidate Generation for Frequent Item Set Generation

M Krishnamurthy, K Manivannan, E Rajalakshmi, A Kannan, A Chilambuchelvan

doi:10.17485/ijst/2015/v8i13/60756

Open Access

https://doi.org/10.17485/ijst/2015/v8i13/60756

Abstract

Frequent item set mining is one of the most investigated areas of data mining. A central challenge is to find new techniques that reduce the number of candidate item sets so that frequent item sets can be generated efficiently. This paper introduces an efficient algorithm, Enhanced Candidate Generation for Frequent Item set Generation (ECG for FIG), for finding frequent item sets in large databases. Existing algorithms for frequent item set generation scan the original database more than once, use more storage space, and take more processing time. The proposed algorithm addresses this by representing the transactions in the database as decimal numbers instead of binary values and strings. The original database is scanned only once and converted into equivalent decimal values to reduce storage space. Subset generation is then used to generate the frequent item sets. The proposed algorithm thus reduces scanning time, processing time, and storage space. Compared with existing algorithms, it takes much less execution time and memory. When implemented in Java and tested with the WEKA tool on 400 transactions of twenty-five items, ECG for FIG takes only 800 bytes of memory and 2000000000 ns (two seconds), whereas the other algorithms compared take 20800 bytes of memory and more than two seconds.
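
The abstract does not spell out the encoding or the subset-generation step, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes each transaction is packed into a single integer whose bit i is set when item i is present (the "decimal value"), that the database is read once to build these integers, and that support is counted by enumerating the subsets of each distinct integer. The function names `encode` and `frequent_item_sets`, the sorted item ordering, and the `min_support` parameter are hypothetical.

```python
from collections import Counter


def encode(transaction, item_index):
    """Pack a transaction into one integer: bit i is set when item i is present."""
    code = 0
    for item in transaction:
        code |= 1 << item_index[item]
    return code


def frequent_item_sets(transactions, min_support):
    # Assumption: items get bit positions in sorted order.
    items = sorted({item for t in transactions for item in t})
    item_index = {item: i for i, item in enumerate(items)}

    # Single pass over the database: keep each integer code and how often it occurs.
    encoded = Counter(encode(t, item_index) for t in transactions)

    # Enumerate every non-empty subset of each distinct code and accumulate support.
    support = Counter()
    for code, count in encoded.items():
        sub = code
        while sub:
            support[sub] += count
            sub = (sub - 1) & code  # next smaller subset of the same bit pattern

    # Decode the subsets that meet the threshold back into item names.
    result = {}
    for code, count in support.items():
        if count >= min_support:
            names = frozenset(items[i] for i in range(len(items)) if code >> i & 1)
            result[names] = count
    return result
```

Calling `frequent_item_sets([["a", "b", "c"], ["a", "b"], ["a", "c"], ["b"]], 2)` returns the item sets that occur in at least two transactions together with their counts. Note that plain subset enumeration grows exponentially with transaction length, so this sketch is only practical for short transactions; any pruning the paper applies is not reflected here.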

Full Text