Abstract

Dimension reduction is an indispensable part of modern data science, and many algorithms have been developed. However, different algorithms have their own strengths and weaknesses, making it important to evaluate their relative performance, and to leverage and combine their individual strengths. This paper proposes a spectral method for assessing and combining multiple visualizations of a given dataset produced by diverse algorithms. The proposed method provides a quantitative measure – the visualization eigenscore – of the relative performance of the visualizations for preserving the structure around each data point. It also generates a consensus visualization, having improved quality over individual visualizations in capturing the underlying structure. Our approach is flexible and works as a wrapper around any visualizations. We analyze multiple real-world datasets to demonstrate the effectiveness of the method. We also provide theoretical justifications based on a general statistical framework, yielding several fundamental principles along with practical guidance.
