Various studies on high utility pattern mining have been conducted to meet the emerging need to consider characteristics of real-world databases, such as the importance and quantity of items. In the traditional utility-based framework, mining results are biased by the number of items in a pattern or, in some cases, by the utilities of individual items. To overcome this drawback, high average utility pattern mining has been proposed. It yields more interesting results because it evaluates each pattern by its average utility, that is, its utility relative to its length. Methods based on this concept have emerged in recent years, including ones that target incremental environments. However, existing algorithms generate an enormous number of candidate patterns or require complex operations during the mining process. To address these inefficiencies, we propose a new and more efficient approach for mining high average utility patterns in dynamic environments. The proposed algorithm uses an indexed-list data structure that is more efficient than those of previous methods. It also incorporates efficient realigning and mining techniques that handle incremental data and produce accurate mining results. Experimental results show that the proposed approach is superior in terms of runtime, memory usage, scalability, and accuracy.
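To make the average utility measure concrete, the following minimal sketch computes it on a toy database using the standard definition: the sum of a pattern's item utilities in a transaction divided by the pattern's length, accumulated over the transactions that contain the pattern. The database contents, the function names, and the threshold ratio are illustrative assumptions, and the code does not reproduce the indexed-list structure or the incremental techniques of the proposed algorithm.

```python
from typing import Dict, Iterable, List

# Hypothetical toy database (an assumption for illustration): each transaction
# maps an item to its utility in that transaction (quantity x unit profit).
Transaction = Dict[str, int]

DATABASE: List[Transaction] = [
    {"a": 6, "b": 2},
    {"a": 3, "c": 9},
    {"a": 6, "b": 4, "c": 2},
]


def average_utility(pattern: Iterable[str], transaction: Transaction) -> float:
    """Average utility of a pattern in one transaction (0 if not contained)."""
    items = set(pattern)
    if not items or not items.issubset(transaction):
        return 0.0
    # Sum of the item utilities divided by the pattern length.
    return sum(transaction[item] for item in items) / len(items)


def database_average_utility(pattern: Iterable[str],
                             database: List[Transaction]) -> float:
    """Average utility of a pattern accumulated over the whole database."""
    items = set(pattern)
    return sum(average_utility(items, t) for t in database)


def is_high_average_utility(pattern: Iterable[str],
                            database: List[Transaction],
                            min_ratio: float) -> bool:
    """Compare against a threshold given as a ratio of total database utility."""
    total_utility = sum(sum(t.values()) for t in database)
    return database_average_utility(pattern, database) >= min_ratio * total_utility


if __name__ == "__main__":
    for pattern in ({"a"}, {"a", "b"}):
        print(sorted(pattern),
              database_average_utility(pattern, DATABASE),
              is_high_average_utility(pattern, DATABASE, min_ratio=0.3))
```

In this toy data the pattern {a, b} has a plain utility of 18, which exceeds the 15 of {a} simply because it contains more items. Its average utility is 9, so dividing by length removes the advantage that longer patterns gain under the traditional framework.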