Abstract

High-utility sequential pattern mining (HUSPM) is an emerging topic in data mining that considers both utility and sequence factors to derive the set of high-utility sequential patterns (HUSPs) from quantitative databases. Several works have reduced the computational cost through various pruning strategies. In this paper, we present an efficient sequence-utility (SU)-Chain structure, which stores more relevant information to improve mining performance. Based on the SU-Chain structure, existing pruning strategies can also be used to prune unpromising candidates early and obtain the qualifying HUSPs. Experiments comparing the approach with state-of-the-art HUSPM algorithms show that the SU-Chain-based model outperforms them in terms of runtime and number of candidates determined.

Highlights

  • Pattern mining aims to find valuable relationships between items/objects in databases, and many kinds of knowledge have been investigated across applications and domains, such as association-rule mining (ARM) [1], [10], sequential-pattern mining (SPM) [2], [9], [22], [23], and high-utility-itemset mining (HUIM) [6], [12], [13], [18], among others

  • SPM, which discovers frequent sequences from a sequence database, is an important research area in data mining and knowledge discovery because it reveals correlations among ordered events and can be applied in many real-life applications and situations

  • In high-utility sequential pattern mining (HUSPM), a sequence is considered a high-utility sequential pattern (HUSP) if its sequence utility is no less than the pre-defined minimum utility value


Summary

INTRODUCTION

Pattern mining aims to find valuable relationships between items/objects in databases, and many kinds of knowledge have been investigated across applications and domains, such as association-rule mining (ARM) [1], [10], sequential-pattern mining (SPM) [2], [9], [22], [23], and high-utility-itemset mining (HUIM) [6], [12], [13], [18], among others. Several algorithms [9], [22], [23] have been proposed to improve the mining efficiency of SPM, but most of them do not consider other factors or attributes in the databases (e.g., importance, weight, or interestingness). To address this limitation and provide more useful and meaningful information, high-utility sequential pattern mining (HUSPM) [17], [29], [32], [33] was presented to consider both utility and sequence factors. ProUM [7] and HUSP-ULL [8] are the state-of-the-art approaches, which introduce a projection mechanism and efficient pruning strategies to mine the HUSPs. Because the above algorithms, including the state-of-the-art HUSP-ULL, are still limited by memory usage, we design an efficient sequence-utility (SU)-Chain structure that keeps more information for the later mining process.
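
To make the high-utility criterion concrete (a sequence is a HUSP if its utility is no less than a pre-defined minimum utility), the following is a minimal Python sketch of how a pattern's utility could be computed over a small quantitative sequence database. It is not the SU-Chain algorithm and uses no pruning. It assumes the convention common in the HUSPM literature that the utility of a pattern in one sequence is the maximum over all of its occurrences, and that the database utility is the sum over sequences; the summary above does not spell this out. All names (`utility_in_sequence`, `sequence_utility`, `is_high_utility`) and the toy data are hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional

QSequence = List[Dict[str, int]]   # ordered itemsets, each mapping item -> quantity
Pattern = List[FrozenSet[str]]     # ordered itemsets of item names


def utility_in_sequence(pattern: Pattern, qseq: QSequence,
                        profit: Dict[str, int]) -> Optional[int]:
    """Maximum utility over all occurrences of `pattern` in `qseq`.

    Returns None if the pattern does not occur. Assumption: the maximum
    over occurrences is used, as is common in HUSPM work.
    """
    # best[k] = best utility found so far for matching the first k pattern itemsets
    best: List[Optional[int]] = [0] + [None] * len(pattern)
    for itemset in qseq:
        # Go from the last pattern position down so that one q-itemset
        # is never matched to two consecutive pattern itemsets.
        for k in range(len(pattern), 0, -1):
            prev = best[k - 1]
            if prev is None or not pattern[k - 1].issubset(itemset):
                continue
            gain = sum(itemset[i] * profit[i] for i in pattern[k - 1])
            if best[k] is None or prev + gain > best[k]:
                best[k] = prev + gain
    return best[len(pattern)]


def sequence_utility(pattern: Pattern, db: List[QSequence],
                     profit: Dict[str, int]) -> int:
    """Sum of the pattern's utility over every sequence that contains it."""
    total = 0
    for qseq in db:
        u = utility_in_sequence(pattern, qseq, profit)
        if u is not None:
            total += u
    return total


def is_high_utility(pattern: Pattern, db: List[QSequence],
                    profit: Dict[str, int], min_util: int) -> bool:
    """HUSP test: utility is no less than the minimum utility value."""
    return sequence_utility(pattern, db, profit) >= min_util


# Toy data (hypothetical): unit profits and three quantitative sequences.
PROFIT = {"a": 2, "b": 5, "c": 1}
DB: List[QSequence] = [
    [{"a": 1, "b": 2}, {"c": 3}, {"a": 4, "c": 1}],
    [{"a": 2}, {"b": 1}, {"c": 2}],
    [{"b": 3}, {"a": 1}],
]
PATTERN: Pattern = [frozenset({"a"}), frozenset({"c"})]  # <{a}, {c}>
```

With this toy data, `is_high_utility(PATTERN, DB, PROFIT, 10)` evaluates the pattern <{a}, {c}> against a minimum utility of 10. A real miner would avoid this exhaustive per-pattern scan, which is the kind of cost that compact structures and pruning strategies are meant to reduce.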
