Abstract

Traditional pattern mining is designed for binary databases and assumes that all items in the database have the same importance, which limits its ability to extract accurate information from real-world databases. To solve this problem, high utility pattern mining approaches for non-binary databases have been proposed and actively studied. Recently, new data is continually generated over time in diverse areas, such as a patient's biometric data recorded by a medical device and the log data of an internet user, so the volume of a database gradually increases. A database with these characteristics is called a dynamic database. Accordingly, high utility mining techniques suitable for analyzing dynamic databases have been studied extensively. In this paper, we propose a new list-based algorithm that mines high utility patterns while considering the arrival time of each transaction in an incremental database environment. Specifically, our algorithm uses a damped window model, which assigns lower importance to previously inserted data than to recently inserted data, to prune patterns efficiently and identify high utility patterns. Experimental results indicate that the proposed method outperforms state-of-the-art techniques in terms of runtime, memory usage, and scalability.
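The damped window idea can be illustrated with a small sketch. It is not the paper's algorithm: it assumes that the decay is applied once per arriving batch and that a pattern's utility in a transaction is the sum of its item utilities. The function name `decayed_pattern_utility` and the sample data are hypothetical.

```python
from typing import Dict, List, Set

# A transaction maps each item to its utility in that transaction
# (for example, quantity times unit profit).
Transaction = Dict[str, float]


def decayed_pattern_utility(
    batches: List[List[Transaction]],
    pattern: Set[str],
    decay: float,
) -> float:
    """Sum the utility of `pattern` over all batches, down-weighting old batches.

    Assumption: batches are ordered oldest to newest, and a batch that is
    k arrivals old is weighted by decay ** k (0 < decay <= 1).
    """
    newest = len(batches) - 1
    total = 0.0
    for index, batch in enumerate(batches):
        weight = decay ** (newest - index)
        for tx in batch:
            # The pattern contributes only if the transaction contains every item.
            if all(item in tx for item in pattern):
                total += weight * sum(tx[item] for item in pattern)
    return total


# Illustrative data only: one older batch and one newly inserted batch.
batches = [
    [{"A": 4, "B": 2}, {"A": 1, "C": 5}],
    [{"A": 3, "B": 6}],
]
damped = decayed_pattern_utility(batches, {"A", "B"}, decay=0.5)
undamped = decayed_pattern_utility(batches, {"A", "B"}, decay=1.0)
print(damped, undamped)
```

With a decay factor below 1, the older occurrence of {A, B} counts for less than the recent one, so a pattern supported mostly by old data can fall below the minimum utility threshold.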

Highlights

  • As the size of data increases, data mining techniques, a field of pattern recognition that analyzes massive data to recognize meaningful information, are attracting attention as a research issue in data analysis

  • We propose an efficient algorithm named Damped High Utility Pattern mining based List (DHUPL) that mines high utility patterns without generating candidate patterns, using a list structure that stores data reflecting the difference in importance according to arrival time

  • In the section "List Based Incremental High Utility Pattern Mining Using Damped Window Model", we describe the preliminary knowledge used in our method


Summary

INTRODUCTION

As the size of data increases, data mining techniques, a field of pattern recognition that analyzes massive data to recognize meaningful information, are attracting attention as a research issue in data analysis. (2) An efficient mining technique is proposed that uses the damped window model to recognize the latest significant high utility patterns by considering a time decaying factor, which is useful for processing time-sensitive databases. To the best of our knowledge, ours is the first work to propose list-based high utility pattern mining that processes stream databases using a time decaying model.

GENHUI [12], which mines high utility patterns by applying a time decaying model, has been proposed previously. It uses a tree structure called RHUI-tree, so it is inefficient for processing stream databases: it requires a large amount of memory to manage the data structure and generates many candidate patterns to find the actual patterns. To overcome these limitations, we propose a list-based algorithm for analyzing stream databases that does not generate candidate patterns during the mining process.

Ub(AE) < minutil, so no pattern extension for {AE} occurs.
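The pruning step mentioned for {AE} can be sketched as follows. The sketch assumes the upper bound Ub is the sum of a pattern's own utility and its remaining utility over the list entries, as in common utility-list miners; the paper's exact definition is not given in this summary. The `Entry` fields, the function names, and all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    tid: int       # transaction identifier
    iutil: float   # (decayed) utility of the pattern in this transaction
    rutil: float   # (decayed) utility of items that could still extend it


def utility(entries: List[Entry]) -> float:
    """Actual utility of the pattern: the sum of its per-transaction utilities."""
    return sum(e.iutil for e in entries)


def upper_bound(entries: List[Entry]) -> float:
    """Assumed upper bound on the utility of any extension of the pattern."""
    return sum(e.iutil + e.rutil for e in entries)


def should_extend(entries: List[Entry], minutil: float) -> bool:
    """Extend a pattern only if its upper bound reaches the threshold."""
    return upper_bound(entries) >= minutil


# Illustrative list for the pattern {A, E}; the values are made up.
ae_list = [Entry(tid=1, iutil=5.0, rutil=0.0), Entry(tid=3, iutil=2.5, rutil=1.0)]
minutil = 10.0
print(utility(ae_list), upper_bound(ae_list), should_extend(ae_list, minutil))
```

Because the bound is computed directly from the list, a pattern whose bound is below minutil is discarded without generating or testing any candidate extensions.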

CONSTRUCTING GLOBAL LISTS DATA STRUCTURE WITH ORIGINAL DATABASE
UPDATING GLOBAL LISTS WITH INCREMENTED DATABASE
Findings
CONCLUSION