While many nonlinear pattern recognition and data mining tasks rely on embedding the data into a latent space, one often needs to extract the patterns in the input space. Estimating the inverse of the nonlinear embedding is the so-called pre-image problem. Several strategies have been proposed to address the estimation of the pre-image; However, there are no theoretical results so far to understand the pre-image problem and its resolution. In this paper, we provide theoretical underpinnings of the resolution of the pre-image problem in Machine Learning. These theoretical results are on the gradient descent optimization, the fixed-point iteration algorithm and Newton’s method. We provide sufficient conditions on the convexity/nonconvexity of the pre-image problem. Moreover, we show that the fixed-point iteration is a Newton update and prove that it is a Majorize-Minimization (MM) algorithm where the surrogate function is a quadratic function. These theoretical results are derived for the wide classes of radial kernels and projective kernels. We also provide other insights by connecting the resolution of this problem to the gradient density estimation problem with the so-called mean shift algorithm.
Read full abstract