Abstract

In a previous paper, we proposed the top-down mining (TDM) algorithm to speed up the mining of hybrid sequential patterns. The distinguishing feature of TDM is that it examines the itemsets in a database in a top-down manner, using decomposition. This method, however, may incur a space problem when transactions from the database are decomposed. In this paper, we therefore propose a new algorithm, transaction decomposition with clustering (TDC), to alleviate the space problem associated with the decomposition method. We use TDC to derive frequent itemsets from a large database. The major feature of TDC is that it divides a database into several smaller projected databases, each of which can be solved independently with the decomposition method. Because far less information has to be held in memory at any one time, TDC can mine frequent itemsets efficiently. We compare experimental results for the proposed method against existing algorithms. The results show that TDC solves the space problem of TDM and outperforms its counterpart algorithms in many cases. Even when the data set is large or the user-specified minimum support is low, TDC still performs well, which makes it suitable for mining frequent itemsets in very large databases.
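The idea of splitting a database into projected databases and decomposing each one independently can be illustrated with a minimal Python sketch. It is not the TDC algorithm itself: the abstract does not describe the clustering step, so the sketch assumes a simple partitioning by the smallest item of each itemset, and the names `project`, `mine_partition`, and `mine_frequent_itemsets` are hypothetical. The decomposition step here naively enumerates every subset of each projected transaction, so it is only practical for short transactions.

```python
from collections import Counter
from itertools import combinations


def project(transactions, item):
    """Assumed partitioning rule (not from the paper): keep the transactions
    that contain `item`, restricted to the items that sort after it."""
    return [sorted(x for x in t if x > item) for t in transactions if item in t]


def mine_partition(item, projected, min_support):
    """Decompose each projected transaction into all of its subsets and
    count them. Every counted itemset has `item` as its smallest member."""
    counts = Counter()
    for t in projected:
        for k in range(len(t) + 1):
            for subset in combinations(t, k):
                counts[(item,) + subset] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


def mine_frequent_itemsets(transactions, min_support):
    """Mine each projected database on its own, so only one partition's
    counts need to be in memory at a time."""
    transactions = [set(t) for t in transactions]
    items = sorted(set().union(*transactions)) if transactions else []
    frequent = {}
    for item in items:
        projected = project(transactions, item)
        if len(projected) < min_support:
            continue  # `item` itself is infrequent, so skip its partition
        frequent.update(mine_partition(item, projected, min_support))
    return frequent


db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"]]
result = mine_frequent_itemsets(db, min_support=2)
```

Because every itemset has exactly one smallest item, the partitions in this sketch are disjoint and together cover all itemsets, which is why each can be processed and discarded independently.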
