Maintenance of Discovered High Average-Utility Itemsets in Dynamic Databases

Binbin Zhang, Yinan Shao, Youcef Djenouri, Jerry Lin, Philippe Fournier-Viger

doi:10.3390/app8050769

Abstract

High-utility itemset mining (HUIM) is an extension of traditional frequent itemset mining that considers both the quantities and the unit profits of items in a database to reveal highly profitable itemsets regardless of their size. High average-utility itemset mining (HAUIM) is designed to find average-utility itemsets by considering both their utility and the number of items that they contain. Average-utility itemsets are thus obtained with a fairer utility measurement, since the average utility typically does not increase much with the size of an itemset. However, most algorithms for discovering high average-utility itemsets are designed to extract patterns from a static database. If the size of a database decreases or increases over time (e.g., as a result of transaction insertions), the database must be scanned again in batch mode to update the results. Previously discovered knowledge is then ignored, and the time previously spent on pattern extraction is wasted. We therefore present an incremental HAUIM algorithm for transaction insertion (FUP-HAUIMI), based on the FUP concept, which maintains information about patterns when a database is updated. An average-utility-list (AUL) structure is first built by scanning the original database. Then, FUP-HAUIMI selects the high average-utility upper-bound itemsets and categorizes them according to four cases. For each case, itemsets are maintained and updated using a specific updating procedure. While traversing the enumeration tree representing the search space in a depth-first way, a join operation is performed to quickly and incrementally update the AUL-structures. Several experiments were carried out to evaluate the runtime, memory usage, number of potential patterns (candidates), and scalability of the designed approach. Results show that FUP-HAUIMI performs very well compared to the state-of-the-art HAUI-Miner algorithm running in batch mode and the state-of-the-art incremental high average-utility pattern mining (IHAUPM) algorithm.

Highlights

Mining useful or meaningful information is a major KDD (Knowledge Discovery in Databases) task, which has been widely considered interesting and useful for more than two decades.
Experiments show that the designed FUP-HAUIMI algorithm performs better at maintaining and updating the discovered HAUIs than the state-of-the-art HAUI-Miner algorithm running in batch mode and the state-of-the-art incremental IHAUPM algorithm.
When transactions are inserted into the original database, the designed FUP-HAUIMI algorithm first divides the high average-utility upper-bound itemsets (HAUUBIs) into four cases; the itemsets of each case are then maintained and updated by the procedure designed for that case.
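The four-case split mentioned above can be sketched as follows. This is a minimal illustration of the general FUP idea, not the paper's procedure: the case numbering, the function names (`fup_case`, `still_hauubi`, `needs_original_rescan`), and the use of a threshold of the form `delta` times the total utility are all assumptions made for this example.

```python
def fup_case(was_hauubi: bool, auub_new: float, tu_new: float, delta: float) -> int:
    """Classify an itemset by its status in the original database and in the
    inserted transactions. The numbering 1-4 is an assumption of this sketch."""
    high_in_new = auub_new >= delta * tu_new
    if was_hauubi and high_in_new:
        return 1  # high in both parts: remains a HAUUBI after the update
    if was_hauubi:
        return 2  # high only in the original: recheck with the stored bound
    if high_in_new:
        return 3  # high only in the insertions: the original bound is unknown
    return 4      # high in neither part: cannot become a HAUUBI


def still_hauubi(auub_orig: float, auub_new: float,
                 tu_orig: float, tu_new: float, delta: float) -> bool:
    """Recheck used for case 2, where the original upper bound was kept."""
    return auub_orig + auub_new >= delta * (tu_orig + tu_new)


def needs_original_rescan(case: int) -> bool:
    """Only case 3 lacks the information stored for the original database."""
    return case == 3
```

Cases 1 and 4 need no recheck because both parts fall on the same side of their thresholds, so their sum falls on the same side of the combined threshold. Only case 3 forces a look back at the original database, which is where an incremental approach saves work over a batch rerun.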

Summary

Introduction

Mining useful or meaningful information is a major KDD (Knowledge Discovery in Databases) task, which has been widely considered interesting and useful for more than two decades. A fundamental algorithm named Apriori [1] was first designed to mine association rules (ARs). It discovers patterns in a level-wise way in a static database. An Apriori-like approach was first designed that considers the length (size) of each itemset. It calculates the average utility of each itemset instead of its utility as in HUIM, which provides a flexible way of measuring the importance of itemsets for decision-making. The auub (Average-Utility Upper Bound) model [20] was presented to obtain a downward closure property by maintaining the HAUUBIs (High Average-Utility Upper-Bound Itemsets), reducing the search space to discover the set of HAUIs. Lin et al. [21] developed an efficient HAUP-tree. The AUL-structure is utilized in the designed algorithm to efficiently keep information for mining patterns and incrementally updating results. Experiments show that the designed FUP-HAUIMI algorithm performs better at maintaining and updating the discovered HAUIs than the state-of-the-art HAUI-Miner algorithm running in batch mode and the state-of-the-art incremental IHAUPM algorithm.
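To make the average-utility and auub measures concrete, here is a small sketch on toy data. It assumes the definitions commonly used in the HAUIM literature: the average utility of an itemset in a transaction is its utility divided by the number of items, and the auub sums the largest single-item utility of every transaction that contains the itemset. The data, the profit table, and the function names are hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable, List

Transaction = Dict[str, int]  # item -> purchased quantity (toy format)


def average_utility(itemset: Iterable[str], db: List[Transaction],
                    profit: Dict[str, int]) -> float:
    """Sum of u(X, T) / |X| over the transactions T that contain all of X.
    Assumes a non-empty itemset."""
    items = list(itemset)
    total = 0.0
    for tx in db:
        if all(i in tx for i in items):
            total += sum(tx[i] * profit[i] for i in items) / len(items)
    return total


def auub(itemset: Iterable[str], db: List[Transaction],
         profit: Dict[str, int]) -> int:
    """Sum of the transaction-maximum utility over transactions containing X."""
    items = list(itemset)
    total = 0
    for tx in db:
        if all(i in tx for i in items):
            total += max(q * profit[i] for i, q in tx.items())
    return total


# Hypothetical unit profits and a three-transaction database.
profit = {"a": 5, "b": 1, "c": 2}
db = [{"a": 1, "b": 4}, {"a": 2, "c": 3}, {"b": 2, "c": 1}]

au_ab = average_utility({"a", "b"}, db, profit)
ub_ab = auub({"a", "b"}, db, profit)
```

In this toy database the upper bound of {a, b} is never smaller than its average utility. That overestimation is what lets an auub-based bound be used for pruning even though the average utility itself is not downward closed.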

Incremental Pattern Discovery

High Average-Utility Itemset Mining

Preliminaries and Problem Statement

Proposed FUP-HAUIMI Algorithm with Transaction Insertion

The Adapted Fast Updated Concept

Runtime

Memory Usage

Number of Patterns

Scalability

Conclusions and Future Work


Journal: Applied Sciences
Publication Date: May 11, 2018
Citations: 19
License type: CC BY 4.0

