Abstract

New graph structures where node labels are members of hierarchically organized ontologies or taxonomies have become commonplace in different domains, e.g., life sciences. It is a challenging task to mine for frequent patterns in this new graph model which we call taxonomy-superimposed graphs, as there may be many patterns that are implied by the generalization/specialization hierarchy of the associated node label taxonomy. Hence, 'standard' graph mining techniques are not directly applicable.In this paper, we present Taxogram, a taxonomy-superimposed graph mining algorithm that can efficiently discover frequent graph structures in a database of taxonomy-superimposed graphs. Taxogram has two advantages: (i) It performs a subgraph isomorphism test once per class of patterns which are structurally isomorphic, but have different labels, and (ii) it reconciles standard graph mining methods with taxonomy-based graph mining and takes advantage of well-studied methods in the literature. Taxogram has three stages: (a) relabeling nodes in the input database, (b) mining pattern classes/families and constructing associated occurrence indices, and (c) computing patterns and eliminating useless (i.e., over-generalized) patterns by post-processing occurrence indices. Experimental results show that Taxogram is significantly more efficient and more scalable compared to other alternative approaches.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call