Unsupervised dimensionality reduction (DR) aims to preserve the structure of the input data in a low-dimensional (LD) space based on neighborhood information. In contrast, supervised DR aims to improve learning performance, e.g., in classification and regression, within an LD representation. Unfortunately, obtaining complete output labels for a data set is difficult in real-world applications. Here, we introduce a novel DR framework that couples the available class labels with input feature similarities to extend the well-known t-distributed Stochastic Neighbor Embedding (t-SNE) to semi-supervised scenarios. Our proposal, termed Semi-Supervised t-SNE (SS.t-SNE), sets the widths of the Gaussian neighborhoods so as to reveal the salient local and global data structures in an LD space. Our approach generalizes both the unsupervised and supervised versions of t-SNE. SS.t-SNE outperforms other semi-supervised DR methods in data visualization and classification tasks in LD embeddings.
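To make the notion of Gaussian neighborhood widths concrete, the following minimal sketch shows how standard, unsupervised t-SNE typically calibrates the width of one point's neighborhood so that it reaches a target perplexity. It is not the SS.t-SNE rule, which the abstract does not specify. The function names, the random data, and the target perplexity of 10 are illustrative assumptions.

```python
import numpy as np

def conditional_probs(sq_dists_i, sigma):
    """Gaussian neighborhood probabilities p_{j|i} for one point (self excluded)."""
    logits = -sq_dists_i / (2.0 * sigma ** 2)
    logits -= logits.max()  # shift for numerical stability
    p = np.exp(logits)
    return p / p.sum()

def perplexity(p):
    """Perplexity 2^H(p), where H is the Shannon entropy in bits."""
    nz = p[p > 0]
    return 2.0 ** (-(nz * np.log2(nz)).sum())

def calibrate_sigma(sq_dists_i, target_perplexity, n_iter=100):
    """Geometric bisection on sigma until the perplexity matches the target.

    Relies on perplexity being non-decreasing in sigma.
    """
    lo, hi = 1e-10, 1e10  # assumed search bracket for sigma
    sigma = 1.0
    for _ in range(n_iter):
        sigma = np.sqrt(lo * hi)
        if perplexity(conditional_probs(sq_dists_i, sigma)) > target_perplexity:
            hi = sigma  # neighborhood too wide
        else:
            lo = sigma  # neighborhood too narrow
    return sigma

# Illustrative data (assumption): 50 random points in 5 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
sq = ((X[1:] - X[0]) ** 2).sum(axis=1)  # squared distances from point 0
sigma0 = calibrate_sigma(sq, target_perplexity=10.0)
p0 = conditional_probs(sq, sigma0)
```

Here `sigma0` is the width assigned to point 0, and `p0` holds its neighborhood probabilities over the remaining 49 points. A semi-supervised variant would additionally use the available class labels when forming these similarities, but that coupling is not sketched here because the abstract gives no formula for it.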