Abstract
Stochastic neighbour embedding (SNE) and its variants are nonlinear dimensionality reduction methods that rely on soft Gaussian neighbourhoods to measure similarities between all pairs of data points. To build a suitable embedding, these methods try to reproduce in a low-dimensional space the neighbourhoods that are observed in the high-dimensional data space. Previous work has investigated the immunity of such similarities to norm concentration, as well as enhanced cost functions, such as sums of Jensen–Shannon divergences. This paper proposes an additional refinement, namely multi-scale similarities, which are averages of soft Gaussian neighbourhoods with exponentially growing bandwidths. Such multi-scale similarities can replace the regular, single-scale neighbourhoods in SNE-like methods. Their objective is to maximise the embedding quality at all scales, preserving both local and global neighbourhoods as well as possible, while also sparing the user from having to fix a scale arbitrarily. Experiments on several data sets show that the proposed multi-scale approach better captures the structure of the data and significantly improves the quality of dimensionality reduction.
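As a rough illustration of the idea described above, the sketch below computes soft Gaussian neighbourhoods for a sequence of exponentially growing bandwidths and averages them. The function names, the bandwidth heuristic, and the doubling schedule are assumptions made here for clarity; the paper sets bandwidths differently (e.g. via perplexities), and this is not the authors' reference implementation.

```python
import numpy as np

def gaussian_neighbourhoods(D, sigma):
    """Row-wise soft Gaussian neighbourhoods from squared distances D,
    with one bandwidth per point (sigma has shape (n,))."""
    P = np.exp(-D / (2.0 * sigma[:, None] ** 2))
    np.fill_diagonal(P, 0.0)                      # exclude self-neighbourhood
    return P / P.sum(axis=1, keepdims=True)       # normalise each row

def multiscale_similarities(X, n_scales=4, base_sigma=None):
    """Average soft Gaussian neighbourhoods over exponentially growing
    bandwidths (illustrative sketch, not the paper's exact scheme)."""
    n = X.shape[0]
    sq_norms = (X ** 2).sum(axis=1)
    D = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T  # squared Euclidean distances
    np.fill_diagonal(D, 0.0)
    if base_sigma is None:
        # assumed heuristic: start from the root-mean-square distance per point
        base_sigma = np.sqrt(D.mean(axis=1) + 1e-12)
    P = np.zeros((n, n))
    for h in range(n_scales):
        P += gaussian_neighbourhoods(D, base_sigma * 2.0 ** h)  # bandwidth doubles per scale
    return P / n_scales
```

For an array `X` of shape `(n_samples, n_features)`, `multiscale_similarities(X)` returns an n-by-n row-stochastic matrix that could stand in for the single-scale neighbourhoods in an SNE-style cost function.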