Abstract

A strategy to assist visualization and analysis of large and complex datasets is dimensionality reduction, with which one maps each data point into a low-dimensional manifold. However, various dimensionality reduction techniques are computationally infeasible for large data. Out-of-sample techniques aim to resolve this difficulty; they only apply the dimensionality reduction technique on a small portion of data, referred to as landmarks, and determine the embedding coordinates of the other points using landmarks as references. Out-of-sample techniques have been applied to online settings, or when data arrive as time series. However, existing online out-of-sample techniques use either all the previous data points as landmarks or the fixed set of landmarks and therefore are potentially not good at capturing the geometry of the entire dataset when the time series is non-stationary. To address this problem, we propose an online landmark replacement algorithm for out-of-sample techniques using geometric graphs and the minimal dominating set on them. We mathematically analyse some properties of the proposed algorithm, particularly focusing on the case of landmark multi-dimensional scaling as the out-of-sample technique, and test its performance on synthetic and empirical time-series data.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.