Traditional frequent pattern mining algorithms do not consider the different semantic significance (weights) of items. Weighted frequent pattern (WFP) mining, which takes these weights into account, has therefore become an important research issue in data mining and knowledge discovery. However, the existing state-of-the-art WFP mining algorithms consider all the data from the very beginning of a database to discover the resultant weighted frequent patterns. Their approaches may therefore be unsuitable for large-scale data environments such as data streams, where the volume of data is huge and unbounded. Moreover, because they also take into account old information that may no longer be interesting in the current time period, they cannot adaptively extract recent changes of knowledge in a data stream. Another major limitation of the existing algorithms is that they must scan a database multiple times to find the resultant weighted frequent patterns. In this paper, we propose a novel large-scale algorithm, WFPMDS (Weighted Frequent Pattern Mining over Data Streams), for sliding window-based WFP mining over data streams. Using a single scan of the data stream, the WFPMDS algorithm can discover important knowledge from the recent data elements. Extensive performance analyses show that our proposed algorithm is very efficient for sliding window-based WFP mining over data streams.
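To make the sliding-window setting concrete, the following is a minimal brute-force sketch in Python. It is not the WFPMDS algorithm itself, whose internal data structure the abstract does not describe. The class name `SlidingWindowWFP`, the parameters `window_size`, `min_wsup`, and `max_len`, and the item weights are all hypothetical. The sketch also assumes a common definition from the WFP literature: the weight of a pattern is the average weight of its items, and its weighted support is that weight multiplied by the pattern's occurrence count in the current window.

```python
from collections import deque
from itertools import combinations


class SlidingWindowWFP:
    """Illustrative brute-force miner over the most recent transactions."""

    def __init__(self, weights, window_size, min_wsup):
        self.weights = weights    # item -> weight (assumed to be given)
        self.min_wsup = min_wsup  # absolute weighted-support threshold (assumed)
        # A bounded deque drops the oldest transaction once the window is full.
        self.window = deque(maxlen=window_size)

    def add(self, transaction):
        """Consume one stream element; each transaction is read only once."""
        self.window.append(frozenset(transaction))

    def pattern_weight(self, pattern):
        # Assumption: pattern weight is the average of its item weights.
        return sum(self.weights[i] for i in pattern) / len(pattern)

    def mine(self, max_len=3):
        """Return {pattern: weighted support} for the current window only."""
        counts = {}
        for t in self.window:
            items = sorted(t)
            for k in range(1, min(max_len, len(items)) + 1):
                for p in combinations(items, k):
                    counts[p] = counts.get(p, 0) + 1
        result = {}
        for p, c in counts.items():
            wsup = self.pattern_weight(p) * c
            if wsup >= self.min_wsup:
                result[p] = wsup
        return result


miner = SlidingWindowWFP({"a": 0.6, "b": 0.9, "c": 0.5}, window_size=3, min_wsup=1.0)
for t in [["a", "b"], ["a", "b"], ["a", "b"]]:
    miner.add(t)
early = miner.mine()   # patterns over {a, b} dominate the window
for t in [["c"], ["c"], ["c"]]:
    miner.add(t)
late = miner.mine()    # the old transactions have slid out of the window
```

In this sketch, `early` contains the patterns built from `a` and `b`, while `late` contains only `("c",)`, because the first three transactions are no longer in the window. Enumerating every subset of every transaction is exponential in transaction length, which is why `max_len` caps the pattern size here; a practical stream miner would need a compact summary structure instead.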