Abstract

t-SNE (t-distributed Stochastic Neighbor Embedding) is one of the most powerful tools for dimensionality reduction and data visualization. By adopting the Student's t-distribution in place of the Gaussian used by the original SNE (Stochastic Neighbor Embedding), t-SNE achieves faster and more stable learning. However, t-SNE still suffers from computational instability due to its dependence on the KL divergence. Our goal is to extend t-SNE in a natural way within the framework of information geometry. Experimental results on MNIST, Fashion-MNIST, and COIL-20 show that, with a well-chosen set of parameters, our generalized t-SNE outperforms the original t-SNE.

Highlights

  • In recent years, as data have grown increasingly complex, statistical analysis and machine learning approaches for handling them have become increasingly important

  • It is known that SNE suffers from the crowding problem; to address it, t-distributed Stochastic Neighbor Embedding (t-SNE) [8], which adopts the Student's t-distribution with one degree of freedom as the probability distribution after compression, is widely used

  • Although t-SNE improves on SNE with only simple modifications, it still suffers from computational instability due to the KL divergence
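The Student-t kernel mentioned in the highlights can be made concrete with a short sketch of how t-SNE forms its low-dimensional affinities. This is the standard t-SNE construction, not the paper's generalization; the function name `tsne_affinities` is our own illustrative choice.

```python
import numpy as np

def tsne_affinities(Y):
    # Low-dimensional affinities of standard t-SNE: a Student-t kernel
    # with one degree of freedom, normalized over all pairs, with q_ii = 0.
    sq = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    kernel = 1.0 / (1.0 + sq)        # (1 + ||y_i - y_j||^2)^(-1)
    np.fill_diagonal(kernel, 0.0)    # exclude self-pairs: q_ii = 0
    return kernel / kernel.sum()     # joint distribution over all pairs

# Five random points in a 2-dimensional map
Y = np.random.default_rng(0).normal(size=(5, 2))
Q = tsne_affinities(Y)
# Q is symmetric, zero on the diagonal, and sums to 1
```

The heavy tail of the Student-t kernel is what alleviates SNE's crowding problem: moderately distant pairs still receive non-negligible affinity in the map.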


Summary

NOTATIONS AND PRELIMINARIES

Let X ⊂ R^d be the d-dimensional input space. The goal of manifold learning or dimensionality reduction is to obtain a map f : X → Y, where Y ⊂ R^s is the output space and s ≪ d. In the s-dimensional map {y_i}_{1≤i≤n} ⊂ R^s, define the joint probability distribution over all pairs {(y_i, y_j)}_{1≤i≠j≤n} through a symmetric matrix Q = (q_ij)_{1≤i,j≤n} with q_ii = 0.

(Kullback–Leibler divergence [12]) The KL divergence D_KL[p ‖ q] : P × P → [0, ∞] is defined between two Radon–Nikodym densities p and q of μ-absolutely continuous probability measures by

    D_KL[p ‖ q] = ∫_X p log(p/q) dμ.

The KL divergence diverges wherever q vanishes while p does not, and simply replacing q by q + ε is not a remedy: since q + ε no longer satisfies the normalization condition ∫_X (q + ε) dμ = 1, such an extension is unnatural. This computational difficulty is a critical issue in the implementation of t-SNE. (Skew divergence [15, 16]) The skew divergence D_S^(λ)[p ‖ q] : P × P → [0, ∞] is defined between two Radon–Nikodym densities p and q of μ-absolutely continuous probability measures by

    D_S^(λ)[p ‖ q] = D_KL[p ‖ (1 − λ)p + λq],    λ ∈ [0, 1].
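The contrast between the two divergences can be seen in a minimal discrete-case sketch. This is an illustration, not the paper's implementation: the function names and the value λ = 0.99 are our own choices, and we use finite probability vectors in place of general Radon–Nikodym densities.

```python
import numpy as np

def kl_divergence(p, q):
    # Discrete KL divergence D_KL[p || q] = sum_i p_i log(p_i / q_i).
    # Diverges (returns inf) whenever some q_i = 0 while p_i > 0.
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    with np.errstate(divide="ignore"):  # allow p_i / 0 -> inf
        return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def skew_divergence(p, q, lam=0.99):
    # Skew divergence (one common convention):
    #   D_S^(lambda)[p || q] = D_KL[p || (1 - lambda) p + lambda q].
    # Mixing a fraction of p into the second argument keeps it bounded
    # away from zero wherever p > 0, so the value is always finite,
    # and the mixture still integrates to 1 (unlike q + epsilon).
    p, q = np.asarray(p, float), np.asarray(q, float)
    return kl_divergence(p, (1.0 - lam) * p + lam * q)

p = [0.5, 0.5, 0.0]
q = [1.0, 0.0, 0.0]   # q vanishes where p does not
print(kl_divergence(p, q))    # inf: the KL divergence blows up
print(skew_divergence(p, q))  # finite
```

Note that λ = 1 recovers the plain KL divergence, so the skew divergence can be read as a smoothed family that interpolates toward it while avoiding its singularity.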

CONTRACTION NOTATION FOR PARTIAL
GEOMETRY OF THE EMBEDDED MANIFOLDS
EXPERIMENTAL RESULTS
NUMERICAL ANALYSIS FOR THE CONVERGENCE
RELATED WORKS
FUTURE WORKS
We have the following future studies.