Abstract

Frequent itemset mining is an important task in data mining, and fuzzy data mining can describe its results more accurately. However, the full set of fuzzy frequent itemsets is often redundant for users; a better approach is to show only the top-k results. In this paper, we define the score of a fuzzy frequent itemset and propose the problem of top-k fuzzy frequent itemset mining, which, to the best of our knowledge, has not been studied before. To address this problem, we employ a data structure named TopKFFITree to store a superset of the mining results, which is significantly smaller than the set of all fuzzy frequent itemsets. We then present an algorithm named TopK-FFI to build and maintain this data structure. The algorithm exploits the monotonicity of the itemset score to prune most fuzzy frequent itemsets immediately. Theoretical analysis and experimental studies over 4 datasets demonstrate that the proposed algorithm reduces runtime and memory cost and significantly outperforms the naive algorithm Top-k-FFI-Miner.
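
The pruning idea can be illustrated with a minimal Python sketch. It is not the TopK-FFI algorithm and does not reproduce the TopKFFITree; the abstract does not give the paper's score definition, so the sketch assumes a common fuzzy support measure (the sum, over transactions, of the minimum membership degree among the itemset's fuzzy items), which can only stay the same or decrease as an itemset grows. The function names, the fuzzy item labels, and the sample data are all hypothetical.

```python
import heapq
from itertools import count


def fuzzy_score(itemset, transactions):
    """Assumed score: sum over transactions of the minimum membership degree."""
    total = 0.0
    for t in transactions:
        # A fuzzy item missing from a transaction contributes membership 0.
        total += min(t.get(item, 0.0) for item in itemset)
    return total


def top_k_fuzzy_itemsets(transactions, k):
    """Depth-first search that keeps the k best itemsets in a min-heap."""
    items = sorted({item for t in transactions for item in t})
    heap = []      # entries are (score, tie_breaker, itemset)
    tie = count()  # keeps heap comparisons away from the itemset tuples

    def threshold():
        # Until k itemsets are held, any positive score qualifies.
        return heap[0][0] if len(heap) == k else 0.0

    def expand(prefix, start):
        for i in range(start, len(items)):
            candidate = prefix + (items[i],)
            score = fuzzy_score(candidate, transactions)
            if score <= threshold():
                # Supersets cannot score higher, so skip the whole branch.
                continue
            if len(heap) < k:
                heapq.heappush(heap, (score, next(tie), candidate))
            else:
                heapq.heapreplace(heap, (score, next(tie), candidate))
            expand(candidate, i + 1)

    expand((), 0)
    return sorted(((s, x) for s, _, x in heap), key=lambda p: (-p[0], p[1]))


# Hypothetical transactions: fuzzy item label -> membership degree.
transactions = [
    {"A.low": 0.8, "B.high": 0.6, "C.mid": 0.2},
    {"A.low": 0.5, "B.high": 0.9},
    {"A.low": 1.0, "C.mid": 0.4},
]
top3 = top_k_fuzzy_itemsets(transactions, 3)
```

As the heap fills, its smallest score acts as a rising threshold, so branches are discarded earlier and earlier; this is the general effect that monotonicity-based pruning relies on.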
