Abstract

Frequent itemset mining over dynamic data is an important problem in knowledge discovery and data mining. Several data stream models are used for mining frequent itemsets. In a data stream model, data arrive at high speed, so mining algorithms must process them under strict time and space constraints. Because it emphasizes recent data and has a bounded memory requirement, the sliding window model is widely used for mining frequent itemsets over data streams. In this paper we propose an algorithm named Variable-Moment for mining both frequent and closed frequent itemsets over data streams. The algorithm is suited to detecting recent changes in the set of frequent itemsets because its window size is variable and is determined dynamically by the extent of concept drift in the arriving data stream. The window expands when there is no concept drift in the arriving stream and shrinks when the concept changes. Relative support is used instead of absolute support so that the threshold scales with the changing window size. The algorithm uses an in-memory data structure to store frequent itemsets. This data structure is updated whenever a batch of transactions is added to or deleted from the sliding window, so that the output is the exact set of frequent itemsets. Extensive experiments on both real and synthetic data show that the algorithm detects concept changes and adapts to the new concept in the data stream by adjusting the window size.
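
The window behaviour described above can be illustrated with a minimal sketch. It is not the Variable-Moment algorithm itself: the abstract does not specify the drift measure, the shrinking policy, or the in-memory data structure, so everything named here is an assumption made for illustration. The class `VariableWindow`, the helper `drift`, the parameters `min_rel_support`, `drift_threshold`, and `max_size`, the Jaccard-based drift score, and the policy of discarding all older batches on drift are hypothetical, and itemsets are counted by brute force rather than maintained incrementally.

```python
from collections import Counter, deque
from itertools import combinations


def count_itemsets(transactions, max_size=2):
    """Count every itemset of up to max_size items (brute force)."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    return counts


def frequent_itemsets(transactions, min_rel_support, max_size=2):
    """Return itemsets whose relative support (count / number of
    transactions) meets the threshold."""
    n = len(transactions)
    if n == 0:
        return set()
    counts = count_itemsets(transactions, max_size)
    return {s for s, c in counts.items() if c / n >= min_rel_support}


def drift(old, new):
    """Hypothetical drift score: 1 - Jaccard similarity of two sets
    of frequent itemsets (0 = identical, 1 = disjoint)."""
    union = old | new
    if not union:
        return 0.0
    return 1.0 - len(old & new) / len(union)


class VariableWindow:
    """Batch-based sliding window whose size depends on drift."""

    def __init__(self, min_rel_support=0.5, drift_threshold=0.5, max_size=2):
        self.min_rel_support = min_rel_support
        self.drift_threshold = drift_threshold
        self.max_size = max_size
        self.batches = deque()

    def _transactions(self):
        return [t for batch in self.batches for t in batch]

    def _frequent(self, transactions):
        return frequent_itemsets(
            transactions, self.min_rel_support, self.max_size
        )

    def add_batch(self, batch):
        """Add a batch and return the frequent itemsets of the window."""
        if self.batches:
            old = self._frequent(self._transactions())
            new = self._frequent(batch)
            if drift(old, new) > self.drift_threshold:
                # Assumed policy: on concept change, shrink the window
                # by discarding all older batches.
                self.batches.clear()
        # With no drift the window simply expands by one batch.
        self.batches.append(batch)
        return self._frequent(self._transactions())
```

Feeding the window two similar batches followed by a batch over different items causes it to grow to two batches and then shrink back to one. Because the threshold is relative, the same `min_rel_support` value applies regardless of how many transactions the window currently holds.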
