Abstract

A kernel is simply a similarity measure that can be applied to input data in its original representation; for example, if the data are originally represented as numbers, then the absolute value of the difference between two numbers can be used as a kernel function. However, the learning algorithm itself, the kernel machine [4], never sees the data in its original form. Instead, the algorithm sees only the values of the kernel functions that have been applied to the original data. This decouples the data representation from the learning algorithm, and thus allows the same machine-learning principles to be applied to a wide variety of data types. The advent of kernel machines [2, 3] greatly simplified learning problems in which the input data come in the form of strings. String kernels, which are just similarity measures on strings, can be used to train a kernel machine on string data simply by replacing its existing kernel function with a string kernel function. There are numerous string kernels, but the emphasis is on efficiency. For example, the well-known Levenshtein distance (also known as the edit distance between two strings) could be used as the basis of a string kernel, but this is usually not done because the edit distance takes quadratic time to compute. One way to construct linear-time string kernels is to use suffix trees [1], which can be used to obtain measures such as the length of the longest common substring of two strings, or a measure that is close to the number of common substrings.
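
The decoupling described above can be illustrated with a minimal Python sketch. It is not the method of the cited works: the function names (`levenshtein`, `longest_common_substring_length`, `gram_matrix`) are hypothetical and chosen only for illustration, and the longest-common-substring routine is a simple quadratic-time dynamic program standing in for the linear-time suffix-tree construction, which is considerably longer to write out. The sketch also makes no claim that these measures are positive semidefinite; it only shows that a learner can be handed a matrix of pairwise values and never the strings themselves.

```python
from typing import Callable, List, Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance by the classic dynamic program: O(len(a) * len(b)) time."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def longest_common_substring_length(a: str, b: str) -> int:
    """Length of the longest common substring of a and b.

    Assumption: a quadratic-time stand-in for illustration only; a
    generalized suffix tree yields the same value in linear time.
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            # Length of the common substring ending at ca and cb.
            run = prev[j - 1] + 1 if ca == cb else 0
            curr.append(run)
            best = max(best, run)
        prev = curr
    return best


def gram_matrix(items: Sequence[str],
                kernel: Callable[[str, str], float]) -> List[List[float]]:
    """Pairwise kernel values: the only view of the data the learner gets."""
    return [[kernel(x, y) for y in items] for x in items]


words = ["kernel", "colonel", "kennel"]
K = gram_matrix(words, longest_common_substring_length)
```

Swapping `longest_common_substring_length` for another string similarity changes `K` but nothing downstream, which is the sense in which the learning algorithm is independent of the data representation.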

