Mining High Utility Itemsets Based on Pattern Growth without Candidate Generation

Yiwei Liu,Le Wang,Bo Jin,Lin Feng

doi:10.3390/math9010035

Open Access

https://doi.org/10.3390/math9010035

Journal: Mathematics	Publication Date: Dec 25, 2020
Citations: 1	License type: CC BY 4.0

Affiliation: Dalian University of Technology, Ningbo University

Abstract

Mining high utility itemsets (HUIs) has been an active research topic in data mining in recent years. Existing HUI mining algorithms typically work in two steps: generating candidate itemsets and then identifying the utility values of those candidates. Their performance depends on the efficiency of both steps, each of which is usually time-consuming. In this study, we propose an efficient pattern-growth-based HUI mining algorithm, called tail-node tree-based high-utility itemset (TNT-HUI) mining. Supported by a novel tree structure, the tail-node tree (TN-Tree), this algorithm avoids the time-consuming candidate generation step as well as the need to scan the original dataset multiple times to obtain exact utility values. The performance of TNT-HUI was evaluated against state-of-the-art benchmark methods on different datasets. Experimental results showed that TNT-HUI outperformed the benchmark algorithms in both execution time and memory use by orders of magnitude. The performance gap is larger for denser datasets and lower thresholds.

Highlights

Pattern discovery from a transactional database has been an important topic in data mining [1,2].
Problem definition: in a transaction dataset, an itemset is a high utility itemset if its utility is not less than a user-specified minimum utility value, where the utility of an item in a transaction is defined as its internal utility multiplied by its external utility.
According to Theorem 1, the TNT-HUI algorithm removes all unpromising items from the original transaction itemsets when it builds the tail-node tree (TN-Tree) from them.
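
The definitions in these highlights can be made concrete with a small sketch. It is not the TNT-HUI algorithm and does not build a TN-Tree; it only computes utilities by brute force on a toy dataset. The data, the `profit` table, and all function names are hypothetical. Theorem 1 is not reproduced on this page, so the pruning step assumes the criterion commonly used in the HUI literature: an item is unpromising when its transaction-weighted utility (TWU) is below the minimum utility value.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> quantity,
# i.e. the item's internal utility in that transaction.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "d": 4},
    {"a": 1, "b": 1, "c": 2, "d": 1},
]
# Hypothetical external utility (for example, unit profit) per item.
profit = {"a": 5, "b": 2, "c": 1, "d": 3}


def utility(itemset):
    """Sum of internal * external utility over transactions containing the whole itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * profit[item] for item in itemset)
    return total


def transaction_utility(t):
    """Utility of all items in one transaction."""
    return sum(qty * profit[item] for item, qty in t.items())


def twu(item):
    """Transaction-weighted utility: total utility of the transactions containing the item."""
    return sum(transaction_utility(t) for t in transactions if item in t)


def promising_items(min_util):
    """Items kept after pruning (assumed criterion: TWU >= min_util)."""
    return sorted(item for item in profit if twu(item) >= min_util)


def high_utility_itemsets(min_util):
    """Brute-force enumeration over promising items; exponential, for illustration only."""
    items = promising_items(min_util)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset)
            if u >= min_util:
                result[itemset] = u
    return result


print(promising_items(30))        # item "d" has TWU 26, so it is pruned
print(high_utility_itemsets(26))  # only ("a", "c") reaches utility 26
```

With a minimum utility of 26, the itemset `("a", "c")` qualifies even though `("a",)` alone has utility 20. Utility is therefore not anti-monotone, which is why HUI miners rely on an upper bound such as TWU for pruning rather than on the utility itself.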

Summary

Introduction

Pattern discovery from a transactional database has been an important topic in data mining [1,2]. Because they cannot retrieve exact utility values directly from the tree, existing pattern-growth-based HUI mining methods must scan the original dataset again to identify HUIs, which requires additional passes of data I/O and adds considerable computational overhead. To address this issue, we propose a novel tree structure, called the tail-node tree (TN-Tree), from which the exact utility values can be retrieved without rescanning the original dataset.

Apriori-Based HUI Mining Algorithms

Pattern-Growth-Based HUI Mining Algorithms

Basic Concepts

TN-Tree for HUI Mining

The Structure of TN-Tree

TN-Tree Construction

Important Concepts about Sub Trees

Algorithm Description

Comparison with Existing HUI Mining Algorithms

Experimental Results

Evaluation of Computational Efficiency

Evaluation of Memory Usage

Conclusions

Similar Papers

Mining high average utility itemsets using artificial fish swarm algorithm with computed multiple minimum average utility thresholds
S.S Nandhini ... S Kannimuthu
Journal of Intelligent & Fuzzy Systems | VOL. 46
10 Jan 2024

H-Map-Based Technique for Mining High Average Utility Itemset
M S Bhuvaneswari ... K Muneeswaran
IETE Journal of Research | VOL. ahead-of-print
27 May 2022

EHNL: An efficient algorithm for mining high utility itemsets with negative utility value and length constraints
Kuldeep Singh ... Bhaskar Biswas
Information Sciences | VOL. 484
24 Jan 2019

A Survey of High-utility Itemsets Mining
Haijun Yang ... Yonghua Lu
Journal of Physics: Conference Series | VOL. 1624
01 Oct 2020
