Abstract

High utility itemset (HUI) mining has recently been an active research topic; it discovers profitable itemsets by considering both the quantity and the profit of items. So far, HUI mining over uncertain datasets and HUI mining over data streams have been studied separately. However, to the best of our knowledge, HUI mining over uncertain data streams has seldom been studied. In this paper, the PHUIMUS (potential high utility itemsets mining over uncertain data stream) algorithm is proposed to mine potential high utility itemsets (PHUIs), that is, itemsets with both high utility and high existential probability, over uncertain data streams using sliding windows. To realize the algorithm, the potential utility list over uncertain data stream (PUS-list) is designed to mine PHUIs without rescanning the analyzed uncertain data stream. The transaction weighted probability and utility tree over uncertain data stream (TWPUS-tree) is also designed to reduce the number of candidate itemsets generated by the PHUIMUS algorithm. Extensive experiments are conducted on real-life and synthetic databases in terms of runtime, number of discovered PHUIs, memory consumption, and scalability. The results show that the proposed algorithm is reasonable and acceptable for mining meaningful PHUIs from uncertain data streams.

Highlights

  • Knowledge discovery in databases (KDD) is an emerging issue, since important, implicit, previously unknown, and potentially useful information can be found in huge databases [1, 2].

  • To the best of our knowledge, Lin et al. [19] proposed PHUI-UP, based on the two-phase model, and PHUI-List, based on a list structure, and Lan et al. [20] proposed UHUI-apriori, based on Apriori; these are the only algorithms that address the high utility itemset (HUI) mining problem over uncertain databases.

  • Because this is considered to be the first work on HUI mining over uncertain data streams, and MHUI-TID is an outstanding algorithm for mining HUIs from data streams, the performance of the designed PHUIMUS algorithm is compared only with MHUI-TID.

Summary

Introduction

Knowledge discovery in databases (KDD) is an emerging issue, since important, implicit, previously unknown, and potentially useful information can be found in huge databases [1, 2]. To deal with the new issue of HUI mining over uncertain data streams, the PHUIMUS algorithm is proposed to mine PHUIs over uncertain data streams based on sliding windows. (1) To the best of our knowledge, little research has addressed mining HUIs over uncertain data streams in a way that takes both uncertainty and timeliness into account. (2) Because HUI mining over uncertain data streams brings existential probability and sliding windows into consideration, the calculation of item utility, itemset utility, transaction utility, and transaction weighted utility changes. (3) The PHUIMUS algorithm is proposed to mine PHUIs over uncertain data streams based on the developed PUS-list and TWPUS-tree in the current window; it efficiently prunes unpromising itemsets and obtains PHUIs without rescanning the analyzed uncertain data stream.
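To make the notions of utility, existential probability, and sliding window more concrete, the following is a minimal brute-force sketch, not the PHUIMUS algorithm itself: it uses neither the PUS-list nor the TWPUS-tree. It rests on several assumptions that do not come from this summary: each transaction carries a single existential probability, an itemset counts as a PHUI when its utility in the window reaches a fraction of the window's total utility and its summed probability reaches a fraction of the window size, and the window holds the last three transactions. The profit table, the stream contents, and all function names are hypothetical.

```python
from collections import deque
from itertools import combinations

# Hypothetical unit profits per item (illustrative values only).
PROFIT = {"a": 4, "b": 1, "c": 3}


def transaction_utility(transaction):
    """Total utility of one transaction: sum of quantity * unit profit."""
    items, _prob = transaction
    return sum(PROFIT[i] * q for i, q in items.items())


def itemset_utility(itemset, transaction):
    """Utility of an itemset in a transaction, or 0 if it is not contained."""
    items, _prob = transaction
    if not all(i in items for i in itemset):
        return 0
    return sum(PROFIT[i] * items[i] for i in itemset)


def mine_window(window, min_util_ratio, min_prob_ratio):
    """Brute-force enumeration of itemsets that pass both thresholds.

    Assumed definition: utility >= min_util_ratio * total window utility and
    summed transaction probability >= min_prob_ratio * number of transactions.
    """
    min_util = min_util_ratio * sum(transaction_utility(t) for t in window)
    min_prob = min_prob_ratio * len(window)
    all_items = sorted({i for items, _ in window for i in items})
    result = {}
    for size in range(1, len(all_items) + 1):
        for itemset in combinations(all_items, size):
            util = sum(itemset_utility(itemset, t) for t in window)
            prob = sum(p for items, p in window
                       if all(i in items for i in itemset))
            if util >= min_util and prob >= min_prob:
                result[itemset] = (util, prob)
    return result


# Each transaction: ({item: quantity}, existential probability).
stream = [
    ({"a": 1, "b": 2}, 0.9),
    ({"b": 3, "c": 1}, 0.6),
    ({"a": 2, "c": 1}, 0.8),
    ({"a": 1, "b": 1, "c": 2}, 0.5),
]

# Sliding window over the last 3 transactions; older ones are dropped.
window = deque(maxlen=3)
for t in stream:
    window.append(t)

phuis = mine_window(list(window), min_util_ratio=0.3, min_prob_ratio=0.4)
print(sorted(phuis))
```

In this toy window the itemset {b, c} has a utility above the utility threshold but a summed probability below the probability threshold, so it is rejected, which illustrates why both conditions are checked. Enumerating every itemset is exponential in the number of items; avoiding that cost is what the pruning structures described above are intended for.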

Related Work
New Definitions and Problem Statement
The Proposed Algorithm for Mining HUIs over Uncertain Data Stream
Experimental Results
Conclusions