Abstract

The nonlinear mapping to the feature space is a central concept in kernel-based machine learning for signal processing, within the framework of positive semidefinite (psd) kernels. Given labeled data, algorithms such as support vector machines (SVMs) or projection methods such as Fisher discriminant analysis may be executed in the feature space. For unsupervised dimensionality reduction in the feature space, the most common approach is to perform principal component analysis (PCA) in that space, thereby maximally capturing the variability of the feature space data; doing so, however, does not necessarily capture any cluster structure in the data. In this tutorial, we review the theory behind the feature space mapping and survey recent advances that broaden the understanding and interpretability of the mapping in terms of a key input space quantity, namely the quadratic Rényi entropy of the data, expressed via the eigenvalues and eigenfunctions of a psd convolution operator. Focusing on the unsupervised case, we describe the identification of entropy-relevant dimensions in the feature space. We review recent results showing that these dimensions capture structure in the data in the form of clusters and that they are, in general, different from the kernel PCA (KPCA) dimensions. Differences between these approaches to dimensionality reduction for visualization and clustering are illustrated.
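The contrast between the two selection criteria described above can be sketched concretely. The following is a minimal illustration, not the paper's implementation: it assumes a Gaussian kernel, and the names gaussian_kernel and top_dimensions are hypothetical. KPCA ranks kernel eigenpairs by eigenvalue (variance captured), while an entropy-based ranking, in the spirit of kernel entropy component analysis, weights each eigenvalue by the squared projection of its eigenvector onto the all-ones vector, since these are the terms of the Parzen-window estimate of the quadratic Rényi entropy.

```python
import numpy as np
from scipy.spatial.distance import cdist


def gaussian_kernel(X, sigma=1.0):
    # Pairwise Gaussian (RBF) kernel matrix: a psd kernel.
    D2 = cdist(X, X, "sqeuclidean")
    return np.exp(-D2 / (2 * sigma ** 2))


def top_dimensions(K, k, criterion="kpca"):
    """Indices of the k feature-space dimensions selected either by the
    KPCA criterion (largest eigenvalues, i.e., variance) or by an entropy
    criterion (largest contribution to the Parzen estimate of the
    quadratic Renyi entropy). Note: KPCA conventionally centers the
    kernel matrix first; centering is omitted here for brevity."""
    eigvals, eigvecs = np.linalg.eigh(K)            # ascending order
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
    if criterion == "kpca":
        scores = eigvals                            # variance captured
    else:
        # Entropy contribution of each eigenpair: lambda_i * (1^T e_i)^2,
        # a term in the decomposition of (1/N^2) 1^T K 1, whose negative
        # logarithm estimates the quadratic Renyi entropy.
        scores = eigvals * (eigvecs.sum(axis=0) ** 2)
    return np.argsort(scores)[::-1][:k]


# Usage: two well-separated clusters (hypothetical toy data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(4, 0.3, (50, 2))])
K = gaussian_kernel(X, sigma=1.0)
print("KPCA picks eigenpairs:", top_dimensions(K, 2, "kpca"))
print("Entropy picks eigenpairs:", top_dimensions(K, 2, "entropy"))
```

On data with clear cluster structure the two rankings can disagree: an eigenpair with a modest eigenvalue may still carry a large entropy contribution, which is why the entropy-relevant dimensions tend to expose clusters that the top-variance KPCA dimensions miss.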
