Abstract

High utility itemset (HUI) mining is the task of finding itemsets whose utility meets a user-defined minimum utility threshold. Many successful studies have been carried out in this field; however, they all rely on the Tidset technique, which records the intersection of transactions in a data structure. This paper presents the DCHUIM algorithm, which mines high utility itemsets using the Diffset technique. Essentially, this mechanism stores the difference set of transactions rather than the intersection set. To achieve this, a DUL data structure is proposed to store the utility information and the difference transactions of an itemset. Furthermore, the algorithm applies pruning strategies such as U-Prune and EUCS-Prune, together with the concept of closed utility, to compress data effectively. As a result, the search space is greatly reduced during mining. Experiments were conducted on large datasets, including Accidents, Mushroom, Retail, and Chainstore, to compare the performance of DCHUIM with that of the HMiner algorithm. The findings indicate that DCHUIM outperforms HMiner in memory usage across all databases and in running time on sparse databases.
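The contrast between Tidsets and Diffsets can be illustrated with a minimal sketch. It shows only the generic set-difference idea on a toy database. It is not the DUL structure proposed in the paper and it does not model utilities. The database contents and all names (`DB`, `tidset`, `t_ab`, `d_ab`, and so on) are hypothetical and chosen for illustration.

```python
# Toy transaction database: transaction id -> set of items (illustrative data only).
DB = {
    1: {"a", "b", "c"},
    2: {"a", "b"},
    3: {"a", "c"},
    4: {"b", "c"},
    5: {"a", "b", "c"},
}


def tidset(item):
    """Return the ids of all transactions that contain `item`."""
    return {tid for tid, items in DB.items() if item in items}


t_a, t_b, t_c = tidset("a"), tidset("b"), tidset("c")

# Tidset approach: an extended itemset keeps the INTERSECTION of tid lists.
t_ab = t_a & t_b
t_ac = t_a & t_c
t_abc = t_ab & t_ac

# Diffset approach: an extended itemset keeps only the tids that its prefix
# has but the extension loses (a set DIFFERENCE).
d_ab = t_a - t_b      # transactions with "a" but without "b"
d_ac = t_a - t_c      # transactions with "a" but without "c"
d_abc = d_ac - d_ab   # transactions with "ab" but without "c"

# Support is recovered from the prefix support minus the diffset size.
sup_ab = len(t_a) - len(d_ab)
sup_abc = sup_ab - len(d_abc)
```

In this toy example the diffsets hold one transaction id each while the corresponding tidsets hold two or three, which hints at why difference sets can take less memory when itemsets share most of their transactions. Whether that saving holds in practice depends on the data.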
