Angular quantization based affinity propagation clustering and its application to astronomical big spectra data

Ke Wang, A-Li Luo, Ping Guo

doi:10.1109/bigdata.2015.7363804

Abstract

The Affinity Propagation (AP) algorithm is a useful clustering technique with several noteworthy advantages, and it has been applied successfully in many domains. However, it does not scale to large data sets because its computational time and memory usage are quadratic in the problem size. In this paper, we focus on the needs of big data analytics and propose an effective and efficient scheme to reduce the computational complexity and memory usage of the AP algorithm. The basic idea of our approach is to embed the data points into distance-preserving binary codes and then decompose the original big data set into a series of small subsets by aggregating similar data points according to their binary codes. Experimental results and an application to real-world astronomical spectral data demonstrate the effectiveness of our approach both quantitatively and visually.
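The decomposition idea described in the abstract can be illustrated with a minimal sketch. It assumes random-hyperplane sign hashing as a stand-in for the binary embedding; this is not the paper's angular quantization scheme, only a simple angle-sensitive code that shows how points are grouped by their bits. The function names `sign_codes` and `partition_by_code` and the parameter `n_bits` are hypothetical. A clustering algorithm such as AP would then be run on each bucket separately, which the sketch leaves out.

```python
import random
from collections import defaultdict


def sign_codes(points, n_bits, seed=0):
    """Map each vector to an n_bits binary code using random hyperplanes.

    Assumption: sign-of-projection hashing, a stand-in for the paper's
    angular quantization. Each bit depends only on the direction of the
    vector, not on its length.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
    codes = []
    for p in points:
        bits = tuple(
            int(sum(w * x for w, x in zip(plane, p)) >= 0.0) for plane in planes
        )
        codes.append(bits)
    return codes


def partition_by_code(codes):
    """Group point indices that share the same binary code."""
    buckets = defaultdict(list)
    for idx, code in enumerate(codes):
        buckets[code].append(idx)
    return dict(buckets)


# Toy data: points 0 and 1 point in the same direction, point 2 is opposite.
points = [
    [1.0, 0.2, 0.0],
    [2.0, 0.4, 0.0],
    [-1.0, -0.2, 0.0],
    [0.0, 1.0, 0.5],
]
codes = sign_codes(points, n_bits=8)
buckets = partition_by_code(codes)
```

If n points are split into k buckets of roughly equal size, the pairwise similarities that must be stored drop from about n² to about n²/k in total, which is where the savings over running a quadratic algorithm on the whole data set would come from.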
