Abstract

Graph kernels have become an established and widely used technique for solving classification tasks on graphs. This survey gives a comprehensive overview of techniques for kernel-based graph classification developed in the past 15 years. We describe and categorize graph kernels based on properties inherent to their design, such as the nature of their extracted graph features, their method of computation, and their applicability to problems in practice. In an extensive experimental evaluation, we study the classification accuracy of a large suite of graph kernels on established benchmarks as well as new datasets. We compare the performance of popular kernels with several baseline methods and study the effect of applying a Gaussian RBF kernel to the metric induced by a graph kernel. In doing so, we find that simple baselines become competitive after this transformation on some datasets. Moreover, we study the extent to which existing graph kernels agree in their predictions (and prediction errors) and obtain a data-driven categorization of kernels as a result. Finally, based on our experimental results, we derive a practitioner's guide to kernel-based graph classification.
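The abstract mentions applying a Gaussian RBF kernel to the metric induced by a graph kernel. The idea can be sketched as follows: any positive semidefinite kernel K induces a distance in its feature space via d(i, j)² = K[i, i] + K[j, j] − 2·K[i, j], and the RBF transformation exponentiates that distance. This is a minimal illustration, not the survey's implementation; the function name and parameter are ours.

```python
import numpy as np

def rbf_from_kernel(K, sigma=1.0):
    """Apply a Gaussian RBF to the metric induced by a precomputed kernel matrix K.

    The squared feature-space distance is d(i, j)^2 = K[i,i] + K[j,j] - 2*K[i,j];
    the result is exp(-d^2 / (2 * sigma^2)), again a valid kernel matrix.
    """
    diag = np.diag(K)
    d2 = diag[:, None] + diag[None, :] - 2.0 * K
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative values from round-off
    return np.exp(-d2 / (2.0 * sigma ** 2))
```

The transformed matrix can be passed to any kernel machine that accepts precomputed kernels (e.g. an SVM with a precomputed-kernel option); sigma is a hyperparameter typically chosen by cross-validation.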

Highlights

  • Machine learning analysis of large, complex datasets has become an integral part of research in both the natural and social sciences

  • We describe and categorize graph kernels based on properties inherent to their design, such as the nature of their extracted graph features, their method of computation, and their applicability to problems in practice

  • We include the shortest-path variant of the Deep Graph Kernel (DeepGK) (Yanardag and Vishwanathan 2015a) with parameters as suggested in Yanardag (2015) (SP feature type, MLE kernel type, window size 5, 10 dimensions), the DBR kernel of Bai et al (2014), and the propagation kernel (Prop) (Neumann et al 2016; Neumann 2016), for which we select the number of diffusion iterations by cross-validation and use the settings recommended by the authors for the other hyperparameters

Introduction

Machine learning analysis of large, complex datasets has become an integral part of research in both the natural and social sciences. Like other random walk kernels, the kernel of Gärtner et al (2003) defines its feature space as the label sequences derived from walks, but it is computed differently, namely via the direct product graph of two labeled input graphs. Another line of work iteratively turns the continuous attributes of a graph into discrete labels using randomized hash functions. This makes it possible to apply fast explicit graph feature maps, which are limited to graphs with discrete annotations, such as the one associated with the Weisfeiler-Lehman subtree kernel (Shervashidze et al 2011). Based on these results, a graph kernel has been proposed that counts the frequencies of isomorphism types of subgraphs around each vertex up to a certain depth. This kernel is able to distinguish the properties above and is computable in polynomial time for graphs of bounded degree. Is there a kernel for graphs with continuous attributes that is superior to the other graph kernels in terms of classification accuracy?
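The Weisfeiler-Lehman subtree kernel mentioned above counts compressed vertex labels produced by iterative neighborhood aggregation: in each round, every vertex's label is replaced by a new label encoding its old label together with the sorted multiset of its neighbors' labels. The following is a minimal sketch under our own assumptions (adjacency-list input, integer labels), not the implementation of Shervashidze et al; in practice the `compress` dictionary must be shared across all graphs so that labels remain comparable.

```python
from collections import Counter

def wl_features(adj, labels, iterations=2, compress=None):
    """Weisfeiler-Lehman subtree features for one graph.

    adj: adjacency list (adj[v] = neighbors of vertex v)
    labels: initial discrete vertex labels (non-negative ints)
    Returns a Counter mapping each (original or compressed) label to its count
    over all iterations; the WL kernel of two graphs is the dot product of
    their feature Counters, computed with a shared `compress` dictionary.
    """
    feats = Counter(labels)
    if compress is None:
        compress = {}
    next_id = max(labels) + 1 + len(compress)
    for _ in range(iterations):
        new_labels = []
        for v, nbrs in enumerate(adj):
            # Signature: own label plus sorted multiset of neighbor labels.
            sig = (labels[v], tuple(sorted(labels[u] for u in nbrs)))
            if sig not in compress:
                compress[sig] = next_id  # fresh id, so rounds never collide
                next_id += 1
            new_labels.append(compress[sig])
        labels = new_labels
        feats.update(labels)
    return feats
```

With feature Counters f and g for two graphs, the kernel value is simply sum(f[l] * g[l] for l in f), which is why explicit feature maps make this kernel fast on graphs with discrete annotations.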

Methods
Results and discussion
Conclusion