Laplacian embedding (LE) aims to project high-dimensional data samples, which often contain nonlinear structures, into a low-dimensional space. However, the distance functions commonly used in the embedding space fail to provide discriminative representations for real-world datasets, especially in text analysis and image processing. Cosine similarity handles sparse data well but is fragile to outliers and noise in the samples or features. In this work, we propose robust spherical LE (RS-LE), a method that builds LE on a novel metric unifying the Euclidean distance and the cosine similarity in a spherical space via a robust l_{p,p}-norm. To solve the resulting optimization problem, which is neither convex nor smooth, we introduce an efficient iterative algorithm, the proximal alternating linearized minimization (PALM) method, which guarantees both convergence of the objective value and global convergence of the iterate sequence. These results are established through rigorous theoretical analysis and validated by experiments.
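As background for the unified spherical metric (a standard identity, not stated explicitly in the abstract): once samples are normalized onto the unit sphere, the squared Euclidean distance becomes an affine function of the cosine similarity,

\[ \left\| \frac{x}{\|x\|_2} - \frac{y}{\|y\|_2} \right\|_2^2 \;=\; 2 - 2\cos(x, y), \]

so a single distance on the sphere can encode both notions; presumably the robust l_{p,p}-norm then replaces the squared residuals to reduce the influence of outliers and noisy features.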