Probabilistic frequent subtrees for efficient graph classification and retrieval

Pascal Welke, Stefan Wrobel, Tamás Horváth

doi:10.1007/s10994-017-5688-7

Abstract

Frequent subgraphs have proved to be powerful features for graph classification and prediction tasks. Their practical use is, however, limited by the computational intractability of pattern enumeration and of embedding graphs into frequent subgraph feature spaces. We propose a simple probabilistic technique that resolves both limitations. In particular, we restrict the pattern language to trees and relax the requirements on both the completeness of the mining algorithm and the correctness of the pattern matching operator by replacing transaction and query graphs with small random samples of their spanning trees. In this way we consider only a random subset of frequent subtrees, called probabilistic frequent subtrees, that can be enumerated efficiently. Our extensive empirical evaluation on artificial and benchmark molecular graph datasets shows that probabilistic frequent subtrees can be listed in practically feasible time and that their predictive and retrieval performance is very close to that of complete sets of frequent subgraphs. We also present several fast techniques for computing the embedding of unseen graphs into (probabilistic frequent) subtree feature spaces. These algorithms exploit the partial order on tree patterns induced by subgraph isomorphism and, as we show empirically, require far fewer evaluations of subtree isomorphism than the standard brute-force algorithm. We also consider partial embeddings, i.e., the case where only part of the feature vector has to be calculated. In particular, we propose a highly effective practical algorithm that significantly reduces the number of pattern matching evaluations required by the classical min-hashing algorithm for approximating Jaccard similarities.
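
As a rough illustration of the classical min-hashing baseline mentioned in the abstract, the following Python sketch estimates the Jaccard similarity of two graphs from their binary pattern feature sets. It is not the authors' algorithm and does not use the partial order on tree patterns. All names (`make_permutations`, `minhash_sketch`, `lazy_minhash_entry`, `estimate_jaccard`, `exact_jaccard`) are hypothetical, and a graph is assumed to be represented simply by the set of indices of the patterns that match it.

```python
import random


def make_permutations(num_patterns, k, seed=0):
    """Draw k random permutations; perm[i] is the rank of pattern i."""
    rng = random.Random(seed)
    perms = []
    for _ in range(k):
        ranks = list(range(num_patterns))
        rng.shuffle(ranks)
        perms.append(ranks)
    return perms


def minhash_sketch(features, permutations):
    """One sketch entry per permutation: the smallest rank among matching patterns.

    An empty feature set gets the sentinel value len(perm).
    """
    return [min((perm[p] for p in features), default=len(perm))
            for perm in permutations]


def lazy_minhash_entry(matches, perm):
    """Compute a single sketch entry with early stopping.

    `matches` is an assumed predicate (pattern index -> bool) standing in for
    a pattern matching test. Patterns are tried in increasing rank, so the
    first match already has the minimum rank and the rest can be skipped.
    Returns (sketch entry, number of evaluations of `matches`).
    """
    order = sorted(range(len(perm)), key=perm.__getitem__)
    for evaluations, pattern in enumerate(order, start=1):
        if matches(pattern):
            return perm[pattern], evaluations
    return len(perm), len(perm)


def estimate_jaccard(sketch_a, sketch_b):
    """Fraction of sketch positions on which the two sketches agree."""
    agree = sum(1 for a, b in zip(sketch_a, sketch_b) if a == b)
    return agree / len(sketch_a)


def exact_jaccard(a, b):
    """Exact Jaccard similarity of two sets (defined as 1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0
```

For two feature sets `a` and `b` over the same pattern list, `estimate_jaccard(minhash_sketch(a, perms), minhash_sketch(b, perms))` approaches `exact_jaccard(a, b)` as the number of permutations grows. `lazy_minhash_entry` shows why only part of the feature vector may need to be computed: once the lowest-ranked matching pattern is found, the remaining patterns need not be tested for that permutation.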

Journal: Machine Learning
Publication Date: Nov 14, 2017
Citations: 4

