Abstract

Non-linear kernel methods can be approximated by fast linear ones using suitable explicit feature maps, allowing their application to large-scale problems. We investigate how convolution kernels for structured data are composed from base kernels and construct corresponding feature maps. On this basis we propose exact and approximative feature maps for widely used graph kernels that are typically computed via the kernel trick. We analyze for which kernels and graph properties computation by explicit feature maps is feasible and actually more efficient. In particular, we derive approximative, explicit feature maps for state-of-the-art kernels supporting real-valued attributes, including the GraphHopper and graph invariant kernels. In extensive experiments we show that our approaches often achieve a classification accuracy close to that of the exact methods based on the kernel trick, but require only a fraction of their running time. Moreover, we propose and analyze algorithms for computing random walk, shortest-path and subgraph matching kernels by explicit and implicit feature maps. Our theoretical results are confirmed experimentally by observing a phase transition in running time with respect to label diversity, walk length and subgraph size, respectively.
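As an illustration of the general idea of replacing a kernel evaluation by an inner product of explicit feature vectors (not one of the specific graph feature maps derived in the paper), the classical random Fourier feature construction approximates the Gaussian RBF kernel; all names and parameters below are our own:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sketch: random Fourier features approximating the RBF kernel
# k(x, y) = exp(-gamma * ||x - y||^2). A linear kernel (dot product) on the
# explicit features phi(x) approximates the non-linear kernel value.
d, D, gamma = 3, 2000, 0.5
W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, D))  # spectral samples
b = rng.uniform(0.0, 2 * np.pi, size=D)                # random phases

def phi(X):
    """Explicit feature map: phi(x) @ phi(y) approximates k(x, y)."""
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

X = rng.normal(size=(5, d))
K_approx = phi(X) @ phi(X).T                       # fast linear kernel
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K_exact = np.exp(-gamma * sq)                      # exact Gram matrix
err = np.abs(K_approx - K_exact).max()
```

The approximation error shrinks as the number of random features `D` grows, while kernel evaluation cost becomes independent of the number of training points.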

Highlights

  • The high memory consumption of this kernel is in accordance with our theoretical analysis, since the multiplication of vertex and edge kernels drastically increases the number of non-zero components of the feature vectors

  • This is explained by the fact that the number of components of the feature vectors of the vertex kernels increases in this case
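The highlight on multiplied vertex and edge kernels reflects a general composition rule: the feature map of a product kernel k1·k2 is the tensor product of the individual feature maps, so the numbers of non-zero components multiply. A minimal sketch in our own notation, not code from the paper:

```python
import numpy as np

def product_feature_map(phi1, phi2):
    # The feature map of k1*k2 is the tensor (outer) product of the
    # individual maps, flattened to a vector.
    return np.outer(phi1, phi2).ravel()

phi1 = np.array([1.0, 0.0, 2.0])        # 2 non-zero components
phi2 = np.array([0.0, 3.0, 0.0, 1.0])   # 2 non-zero components
phi = product_feature_map(phi1, phi2)   # 2 * 2 = 4 non-zero components

# Sanity check: the dot product in the combined space equals the
# product of the two kernel values.
k1 = phi1 @ phi1
k2 = phi2 @ phi2
assert np.isclose(phi @ phi, k1 * k2)
```

This is why multiplying vertex and edge kernels can drastically increase memory consumption of explicit feature vectors, as observed in the experiments.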



Introduction

Analyzing complex data is becoming more and more important. In numerous application domains, e.g., chem- and bioinformatics, neuroscience, or image and social network analysis, the data is structured and can naturally be represented as graphs. To achieve successful learning we need to exploit the rich information inherent in the graph structure and the annotations of vertices and edges. A popular approach to mining structured data is to design graph kernels measuring the similarity between pairs of graphs. Graphs, composed of labeled vertices and edges, possibly enriched with continuous attributes, are not fixed-length vectors but rather complicated data structures, and standard kernels cannot be used. Graph kernels therefore typically decompose graphs into parts, such as walks or subgraphs; the various graph kernels proposed in the literature mainly differ in the way these parts are constructed and in the similarity measure used to compare them. Existing graph kernels also differ in their ability to exploit annotations, which may be categorical labels or real-valued attributes on the vertices and edges.
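A minimal example of this decomposition idea is a vertex label histogram kernel, whose explicit feature map simply counts label occurrences; the graphs compared then need not be vectors of equal size. A hedged sketch in our own notation:

```python
from collections import Counter

def label_histogram(vertex_labels):
    # A graph is represented here only by the multiset of its vertex
    # labels; the histogram is its explicit (sparse) feature vector.
    return Counter(vertex_labels)

def histogram_kernel(labels1, labels2):
    # Dot product of the two label histograms: counts pairs of
    # vertices with matching labels.
    h1, h2 = label_histogram(labels1), label_histogram(labels2)
    return sum(h1[l] * h2[l] for l in h1)

g1 = ["C", "C", "O", "H"]   # e.g., atom types of a small molecule
g2 = ["C", "O", "O", "N"]
histogram_kernel(g1, g2)    # 2*1 (C) + 1*2 (O) = 4
```

Richer graph kernels refine this scheme by comparing more expressive parts, such as walks or subgraphs, and by replacing exact label matching with base kernels on attributes.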

