Abstract

An important issue in data mining is the interestingness problem: the number of patterns discovered in a data mining process can easily exceed a human user's capacity to identify the interesting results. To address this problem, utility measures have been used to reduce the set of patterns before presenting them to the user. The fundamental idea behind frequent itemset mining is that only itemsets with high frequency are of interest to users. However, the practical usefulness of frequent itemsets is limited by the significance of the discovered itemsets. A frequent itemset reflects only the statistical correlation between items; it does not reflect their semantic significance. In this paper, we use a utility-based itemset mining approach to overcome this limitation. Utility-based data mining is a new research area concerned with all types of utility factors in data mining processes and aimed at incorporating utility considerations into data mining tasks. High utility itemset mining, a subarea of utility-based data mining, aims to find itemsets that contribute high utility. This paper presents a novel algorithm, Fast Utility Mining (FUM), which finds all high utility itemsets that satisfy a given utility constraint threshold. It is faster and simpler than the original Umining algorithm. Experimental evaluation on artificial datasets shows that FUM executes faster than Umining when more itemsets are identified as high utility itemsets and when the number of distinct items in the database increases. FUM also scales well as the transaction database grows in its number of distinct items.
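To make the contrast between frequency and utility concrete, the following is a minimal sketch of the high utility itemset definition. It is a brute-force enumeration, not the FUM or Umining algorithm, whose details the abstract does not give. It assumes the common formulation in which an itemset's utility is the sum of quantity times per-unit utility over the transactions that contain every item of the itemset. The toy transactions, the unit utilities, and the function names `itemset_utility`, `support`, and `high_utility_itemsets` are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"A": 1, "B": 2},
    {"A": 2, "C": 1},
    {"B": 1, "C": 3},
    {"A": 1, "B": 1, "C": 1},
]

# Hypothetical per-unit utility (for example, profit) of each item.
unit_utility = {"A": 5, "B": 2, "C": 1}


def itemset_utility(itemset, transactions, unit_utility):
    """Sum quantity * unit utility over transactions containing every item."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_utility[item] for item in itemset)
    return total


def support(itemset, transactions):
    """Count the transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if all(item in t for item in itemset))


def high_utility_itemsets(transactions, unit_utility, min_utility):
    """Brute force: test every non-empty itemset against the threshold."""
    items = sorted(unit_utility)
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            u = itemset_utility(itemset, transactions, unit_utility)
            if u >= min_utility:
                result[itemset] = u
    return result


if __name__ == "__main__":
    for itemset, u in high_utility_itemsets(transactions, unit_utility, 15).items():
        print(itemset, "utility:", u, "support:", support(itemset, transactions))
```

In this toy data, the itemsets (A, B) and (B, C) each occur in two transactions, yet their utilities are 16 and 8, so only the former passes a threshold of 15. Exhaustive enumeration grows exponentially with the number of distinct items, which is the cost that dedicated algorithms aim to avoid.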
