Frequent Pattern Mining over Unstructured Data using Semi-Structured Doc-Model and Pattern Ranking

Sudhir Tirumalasetty,A Divya,D Anusha,D Rahitya Lakshmi,Ch Durga Bhavani

doi:10.32628/cseit206216

Abstract

Frequent pattern mining is an essential data-mining task, with a goal of discovering knowledge in the form of repeated patterns. Many efficient pattern-mining algorithms have been discovered in the last two decades, yet most do not scale to the type of data we are presented with today, the so-called “Big Data”. Scalable parallel algorithms hold the key to solving the problem in this context. This paper reviews recent advances in parallel frequent pattern mining, analysing them through the Big Data lens. Load balancing and work partitioning are the major challenges to be conquered. These challenges always invoke innovative methods to do, as Big Data evolves with no limits. The biggest challenge than before is conquering unstructured data for finding frequent patterns. To accomplish this Semi Structured Doc-Model and ranking of patterns are used.

Full Text