Abstract

We address nonlinear clustering on large-scale datasets. Popular kernels (RBF, polynomial, spatial pyramid, etc.) are widely used to implicitly map data into a high-dimensional or infinite-dimensional space in order to generalise linear clustering methods. However, kernel clustering approaches built on these kernels are not directly applicable to large-scale datasets, since a large-scale kernel matrix or similarity matrix consumes a great deal of memory (e.g., 7,450 GB of memory for 1 million samples). To solve this problem, we introduce an Euler clustering approach. Euler clustering employs Euler kernels to intrinsically map the input data onto a complex space of the same dimension as the input, or twice that dimension, so that Euler clustering dispenses with the kernel trick and does not rely on any approximation or random sampling of the kernel function or matrix, whilst performing nonlinear clustering that is more robust against noise and outliers. Moreover, since the original Euler kernel cannot generate a non-negative similarity matrix and is thus inapplicable to spectral clustering, we introduce a positive Euler kernel and, more importantly, prove the conditions under which it generates a non-negative similarity matrix. We apply the Euler kernel and the proposed positive Euler kernel to kernel k-means and spectral clustering to develop Euler k-means and Euler spectral clustering, respectively. An efficient Stiefel-manifold-based gradient method and an equivalent weighted positive Euler k-means are derived to compute Euler spectral clustering quickly and to further alleviate the impact of discretizing the cluster membership indicators in Euler spectral clustering. The results show that the proposed Euler clustering approach achieves better overall clustering performance than popular Mercer kernels and approximation models, whilst keeping the computational complexity of the same order of magnitude as that of k-means, the most popular linear clustering method.
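The memory figure and the idea of an explicit complex-valued mapping can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state the mapping formula, so the code assumes the form commonly used for the Euler representation, z = exp(i·α·π·x)/√2 applied element-wise, with inputs scaled to [0, 1] and α = 1.9. The function names `kernel_matrix_gib`, `euler_map`, and `euler_kmeans` are hypothetical, and the clustering step is plain Lloyd-style k-means run on the mapped data.

```python
import numpy as np


def kernel_matrix_gib(n_samples, bytes_per_entry=8):
    """Memory (GiB) of a dense n x n kernel matrix, assuming 8-byte floats."""
    return n_samples ** 2 * bytes_per_entry / 2 ** 30


def euler_map(X, alpha=1.9):
    """Assumed Euler mapping: z = exp(i * alpha * pi * x) / sqrt(2), element-wise.

    X is expected to be scaled to [0, 1]; the output is complex with the
    same shape as X, so no n x n kernel matrix is ever formed.
    """
    return np.exp(1j * alpha * np.pi * X) / np.sqrt(2.0)


def euler_kmeans(X, k, alpha=1.9, n_iter=50, seed=0):
    """Lloyd-style k-means on the explicitly mapped complex features."""
    rng = np.random.default_rng(seed)
    Z = euler_map(X, alpha)
    # Initialise centres from k distinct mapped samples.
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance in complex space, one centre at a time,
        # which keeps the working memory at O(n * d).
        dist = np.stack(
            [(np.abs(Z - c) ** 2).sum(axis=1) for c in centers], axis=1
        )
        labels = dist.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = Z[labels == j]
            if len(members):  # keep the old centre if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


if __name__ == "__main__":
    # 10^6 x 10^6 entries at 8 bytes each: about 7,450 GiB.
    print(round(kernel_matrix_gib(10 ** 6)))
    X = np.random.default_rng(1).random((200, 5))  # toy data in [0, 1]
    labels, centers = euler_kmeans(X, k=3)
    print(labels.shape, centers.shape)
```

Because the mapping is explicit, each iteration costs on the order of n·k·d operations, the same order as ordinary k-means, whereas a dense kernel matrix grows quadratically with the number of samples.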

