Visualizing streaming high-dimensional data requires balancing three considerations: the speed of the dimensionality reduction algorithm, the quality of the visualized data patterns, and the stability of views that change over time as new data arrive. Existing methods for streaming high-dimensional data visualization mostly arrange their essential modules serially and often struggle to satisfy all three considerations. In this research, we propose a novel parallel framework for streaming high-dimensional data visualization that aims for high data processing speed, high-quality data patterns, and stable visual presentations. The framework runs all essential modules in parallel to reduce the delays that arise when modules wait on one another in a serial setup. To support the parallel pipeline, we redesign these modules with a parametric non-linear embedding method for embedding new data, an incremental learning method for updating the embedding function online, and a hybrid strategy for optimized embedding updating. We also improve the coordination mechanism among these modules. Our experiments show that our method offers advantages in embedding speed, quality, and stability over existing methods for visualizing streaming high-dimensional data.