Sequence pattern discovery is a fundamental topic in data mining. It has been widely used to solve a variety of problems (e.g., behavior pattern discovery, gene pattern discovery in bioinformatics, and user click pattern mining). High-utility sequence mining, a more recent and more challenging problem, has attracted considerable attention. Our paper focuses on mining high-utility sequences efficiently in a more complicated setting. Most previous utility mining methods find high-utility sequences only when all items have positive utility values, but many real-world situations involve items with both positive and negative values. Several algorithms have been proposed for this more complex setting, and we use them as baselines for comparison. In this paper, we introduce the FHUSN (Fast mining High Utility Sequences with Negative item) algorithm to mine high-utility sequences in situations with or without negative utility values. FHUSN stores data in a new utility array structure. It also uses several new pruning strategies, applicable with or without negative values, to reduce the search space. Experiments are carried out on several benchmark datasets, and the results show that our method outperforms the compared algorithms.