Abstract

Frequent graph pattern mining is an active area of data mining, and many researchers have developed efficient and useful mining techniques by integrating fundamental graph mining with other advanced mining approaches. However, previous graph mining approaches share a serious limitation: they cannot reflect important real-world characteristics because they cannot process both (1) different element importance and (2) multiple minimum support thresholds suited to each graph element. In other words, graph elements in the real world have not only frequencies but also their own importance, and the various elements composing a graph may require different thresholds according to their characteristics. Traditional methods do not consider these features. To overcome these issues, we propose a new frequent graph pattern mining method that handles both different element importance and multiple minimum support thresholds. The devised algorithm yields more meaningful graph patterns with higher importance. We also demonstrate that the proposed algorithm outperforms previous state-of-the-art approaches in terms of graph pattern generation, runtime, and memory usage.

Highlights

  • Data mining has been actively researched because of its useful applications such as classifying malicious information on web pages [1], mining consumer attitude and behavior [2], detecting or diagnosing important information [3,4], and other various mining applications [5,6,7,8,9]

  • Traditional frequent pattern mining methods, which focus only on databases composed of simple items, cannot handle complex graph-structured data such as network data [15,16,17]

  • Previous frequent graph pattern mining approaches are still limited because they cannot address two important real-world issues: (1) the rare item problem [22], in which items or patterns with low support values, as well as those with high support, may carry meaningful information; and (2) the different importance problem [9], in which graph elements derived from real-world applications have different importance or weight values depending on their characteristics

Summary

Introduction

Data mining has been actively researched because of its useful applications, such as classifying malicious information on web pages [1], mining consumer attitude and behavior [2], detecting or diagnosing important information [3,4], and various other mining applications [5,6,7,8,9]. Traditional frequent pattern mining methods, which focus only on databases composed of simple items, cannot handle complex graph-structured data such as network data [15,16,17]. Mining such complex data is essential because data obtained from real-world applications has become increasingly massive and complicated. To overcome these limitations, frequent graph pattern mining has been proposed, and a variety of related methods have been studied [18,19,20,21], either developing novel techniques for performance improvement or integrating graph mining with other mining fields. Previous frequent graph pattern mining approaches are still limited, however, because they cannot address two important real-world issues: (1) the rare item problem [22], in which items or patterns with low support values, as well as those with high support, may carry meaningful information; and (2) the different importance problem [9], in which graph elements derived from real-world applications have different importance or weight values depending on their characteristics.
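
The two issues named above can be illustrated with a small toy example. The sketch below is not the paper's algorithm; it only contrasts a single global minimum support with per-element thresholds combined with element weights. The graph database, the weights, the thresholds, and the rule "support times weight must reach the element's own threshold" are all assumptions made for illustration.

```python
from collections import Counter

# Toy graph database: each graph is a set of labeled edges
# (source label, edge label, target label). All values are hypothetical.
AXB = ("A", "x", "B")
BYC = ("B", "y", "C")
CZD = ("C", "z", "D")
GRAPHS = [{AXB, BYC}, {AXB, BYC}, {AXB, CZD}, {AXB}]


def edge_supports(graphs):
    """Count, for each labeled edge, the number of graphs containing it."""
    counts = Counter()
    for graph in graphs:
        counts.update(graph)  # a set, so each edge counts once per graph
    return counts


def frequent_single(graphs, min_sup):
    """Edges whose support reaches one global minimum support threshold."""
    return {e for e, s in edge_supports(graphs).items() if s >= min_sup}


def frequent_multiple(graphs, min_sup_of, weight_of):
    """Edges whose weighted support reaches their own threshold.

    Assumed rule for this sketch: support * weight >= per-edge threshold.
    """
    return {
        e
        for e, s in edge_supports(graphs).items()
        if s * weight_of[e] >= min_sup_of[e]
    }


if __name__ == "__main__":
    # Single threshold: the rare edge CZD (support 1) is discarded.
    print(frequent_single(GRAPHS, min_sup=2))

    # Per-edge thresholds and weights: the rare edge CZD survives,
    # while the frequent but low-weight edge AXB is discarded.
    min_sup_of = {AXB: 3, BYC: 2, CZD: 1}
    weight_of = {AXB: 0.5, BYC: 1.0, CZD: 1.0}
    print(frequent_multiple(GRAPHS, min_sup_of, weight_of))
```

With one global threshold, a rare element can only be kept by lowering the threshold for every element, which floods the result with uninteresting patterns. Per-element thresholds and weights avoid that trade-off in this toy setting.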

Related Work
Preliminaries
Maintaining the Correctness of the Proposed Algorithm
WRG-Miner Algorithm
Experimental Settings
Experimental Results of the Proposed Algorithm
Conclusions