Abstract

As an important technology in computer science, data mining aims to mine hidden, previously unknown, and potentially valuable patterns from databases.High utility negative sequential rule (HUNSR) mining can provide more comprehensive decision-making information than high utility sequential rule (HUSR) mining by taking non-occurring events into account. HUNSR mining is much more difficult than HUSR mining because of two key intrinsic complexities. One is how to define the HUNSR mining problem and the other is how to calculate the antecedent’s local utility value in a HUNSR, a key issue in calculating the utility-confidence of the HUNSR. To address the intrinsic complexities, we propose a comprehensive algorithm called e-HUNSR and the contributions are as follows. (1) We formalize the problem of HUNSR mining by proposing a series of concepts. (2) We propose a novel data structure to store the related information of HUNSR candidate (HUNSRC) and a method to efficiently calculate the local utility value and utility of HUNSRC’s antecedent. (3) We propose an efficient method to generate HUNSRC based on high utility negative sequential pattern (HUNSP) and a pruning strategy to prune meaningless HUNSRC. To the best of our knowledge, e-HUNSR is the first algorithm to efficiently mine HUNSR. The experimental results on two real-life and 12 synthetic datasets show that e-HUNSR is very efficient.

Highlights

  • Data mining is a very important technology in the field of computer science

  • (3) We propose an efficient method to generate HUNSR candidate (HUNSRC) based on high utility negative sequential pattern (HUNSP) and a pruning strategy to prune meaningless HUNSRC

  • The experimental results on two real-life and 12 synthetic datasets show that e-High utility negative sequential rule (HUNSR) is very efficient

Read more

Summary

Introduction

Data mining is a very important technology in the field of computer science. It is a non-trivial process of revealing hidden, previously unknown and potentially valuable information from a large amount of data in the database. Current research topics of utility-driven mining focus primarily on discovering patterns of high value (e.g., high profit) in large databases or analyzing the important factors (e.g., economic factors) in the data mining process [1]. High utility sequential pattern (HUSP) mining is an important task in utility-driven mining [2,3,4,5,6,7,8]. Current HUSP mining algorithms, only consider occurring events and do not take non-occurring events into account, which results in the loss of a lot of useful information [9,10,11]

Methods
Results
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.