Maximal frequent subgraph mining (MFSM) is the task of mining only maximal frequent subgraphs, i.e., subgraphs that are not a part of other frequent subgraphs. Although many intelligent systems require MFSM, MFSM is challenging compared to frequent subgraph mining (FSM), as maximal frequent subgraphs lie in the middle of graph lattice, and FSM algorithms must explore an exponential space and an NP-hard subroutine of frequency counting. Different from prior research, which primarily focused on optimal solutions, we introduce pmMine, a progressive graph neural framework designed for MFSM in a single large graph to attain an approximate solution. The framework combines isomorphic graph embedding, non-parametric partitioning, and an efficiently top-down pattern searching strategy. The critical insight that makes pmMine work is to define the concepts of rooted subgraph and isomorphic graph embedding, in which the costly isomorphism subroutine can be efficiently performed using similarity estimation in embedding space. In addition, pmMine returns the patterns identified during the mining process in a progressive manner. We validate the efficiency and effectiveness of our technique through extensive experiments on a variety of datasets spanning various domains.
Read full abstract