Abstract

The recent Bigdata and IoT era has presented a number of applications that generate objects in a streaming fashion. It is well-known that real-time mining of important patterns from data streams support many domains. In retail markets and social network services, for example, such patterns are itemsets and words that frequently appear in many user-accounts, i.e., co-occurrence patterns . To efficiently monitor co-occurrence patterns, we address the novel problem of mining top-k closed co-occurrence patterns across multiple streams. We employ sliding window setting in this problem, and each pattern is ranked based on count, which is the number of streams that have generated the pattern. Since objects are consecutively generated and deleted, the count of a given pattern is dynamic, which may change the rank of the pattern. This renders a challenge to monitoring the top-k answer in real-time. We propose an index-based algorithm that addresses the challenge and provides the exact answer. Specifically, we propose the CP-Graph, a hybrid index of graph and inverted file structures. The CP-Graph can efficiently compute the count of a given pattern and update the answer while pruning unnecessary patterns. Our experimental study on real datasets demonstrates the efficiency and scalability of our solution.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.