Abstract

In this work, we propose a new data visualization and clustering technique for discovering discriminative structures in high-dimensional data. This technique, referred to as cPCA++, is motivated by the fact that the interesting features of a “target” dataset may be obscured by high-variance components during traditional PCA. By analyzing what is referred to as a “background” dataset (i.e., one that exhibits the high-variance principal components but not the interesting structures), our technique efficiently highlights the structures that are unique to the “target” dataset. Similar to another recently proposed algorithm called “contrastive PCA” (cPCA), the proposed cPCA++ method identifies important dataset-specific patterns that are not detected by traditional PCA in a wide variety of settings. However, unlike cPCA, the proposed cPCA++ method does not require a parameter sweep, and as a result, it is significantly more efficient. Several experiments were conducted to compare the proposed method to state-of-the-art methods. These experiments show that the proposed method achieves performance that is similar to or better than that of the other methods, while being more efficient.
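
The abstract does not spell out the computation, so the following is only a minimal sketch of the target-versus-background idea, not the authors' implementation. It assumes the contrast is taken as the leading eigenvectors of inv(R_b) R_t, where R_t and R_b are the sample covariance matrices of the target and background data. The function name `contrastive_directions`, the `ridge` regularizer, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def contrastive_directions(target, background, k=2, ridge=1e-6):
    """Return k directions whose target variance is large relative to
    their background variance (illustrative sketch, see assumptions above)."""
    # Center each dataset on its own mean.
    t = target - target.mean(axis=0)
    b = background - background.mean(axis=0)
    # Sample covariance matrices, shape (features, features).
    r_t = t.T @ t / (len(t) - 1)
    r_b = b.T @ b / (len(b) - 1)
    # Assumption: a small ridge keeps the background covariance invertible.
    r_b = r_b + ridge * np.eye(r_b.shape[0])
    # Eigen-decompose inv(r_b) @ r_t; solve() avoids forming the inverse.
    evals, evecs = np.linalg.eig(np.linalg.solve(r_b, r_t))
    # Keep the k eigenvectors with the largest eigenvalues.
    order = np.argsort(evals.real)[::-1][:k]
    return evecs[:, order].real, evals[order].real

# Synthetic example: feature 0 has high variance in both datasets,
# while feature 1 carries a two-cluster structure only in the target.
rng = np.random.default_rng(0)
scale = np.array([10.0, 1.0, 1.0])
background = rng.normal(size=(500, 3)) * scale
target = rng.normal(size=(500, 3)) * scale
target[:250, 1] += 4.0
target[250:, 1] -= 4.0

directions, ratios = contrastive_directions(target, background, k=1)
projected = (target - target.mean(axis=0)) @ directions
```

In this toy setting, plain PCA on `target` would be dominated by feature 0, whereas the leading contrastive direction should align with feature 1, where the target-only structure lives. No sweep over a contrast parameter is involved in this formulation.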
