Principal Direction Divisive Partitioning with Kernels and k-Means Steering

Dimitrios Zeimpekis, Efstratios Gallopoulos

doi:10.1007/978-1-84800-046-9_3

Abstract

Clustering is a fundamental task in data mining. We propose, implement, and evaluate several schemes that combine partitioning and hierarchical algorithms, specifically k-means and principal direction divisive partitioning (PDDP). Drawing on available theory for the solution of the clustering indicator vector problem, we use 2-means to induce partitionings around fixed or varying cut-points. 2-means is applied either to the data or to its projection onto a one-dimensional subspace. These techniques are also extended to PDDP(l), a multiway clustering algorithm that generalizes PDDP. To handle data that are not linearly separable, we establish the algebraic framework for a kernel variant, KPDDP. Extensive experiments demonstrate the performance of the above methods and suggest that it is advantageous to steer PDDP using k-means. They also show that KPDDP can provide results of higher quality than kernel k-means.
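To make the idea of steering a PDDP split with 2-means concrete, the following is a minimal sketch of a single bisection step, not the authors' implementation. It assumes that the one-dimensional subspace is the leading principal direction of the mean-centered data, that the initial cut-point is the mean (zero after centering), and that 2-means is then run on the scalar projections starting from that cut. The function names `principal_direction`, `two_means_1d`, and `steered_split` are hypothetical, and power iteration stands in for a proper SVD routine.

```python
import math
import random


def principal_direction(rows, iters=200, seed=0):
    """Approximate the leading principal direction of mean-centered rows.

    Uses power iteration on C^T C (C = centered data) as a stand-in for an SVD.
    Returns the unit direction and the projection of each centered row on it.
    """
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]

    rng = random.Random(seed)  # seeded so the sketch is reproducible
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]

    for _ in range(iters):
        # w = C^T (C v), computed without forming the d x d matrix
        proj = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # degenerate data: all rows identical
            break
        v = [x / norm for x in w]

    proj = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, proj


def two_means_1d(values, cut=0.0, iters=100):
    """Lloyd-style 2-means on scalars, initialised from the split at `cut`."""
    labels = [0 if x <= cut else 1 for x in values]
    for _ in range(iters):
        left = [x for x, lab in zip(values, labels) if lab == 0]
        right = [x for x, lab in zip(values, labels) if lab == 1]
        if not left or not right:  # one side is empty: keep current labels
            break
        c0 = sum(left) / len(left)
        c1 = sum(right) / len(right)
        new = [0 if abs(x - c0) <= abs(x - c1) else 1 for x in values]
        if new == labels:  # assignments are stable
            break
        labels = new
    return labels


def steered_split(rows):
    """One bisection: project on the principal direction, then refine with 2-means."""
    _, proj = principal_direction(rows)
    return two_means_1d(proj, cut=0.0)


if __name__ == "__main__":
    data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
    print(steered_split(data))
```

Because the sign of a principal direction is arbitrary, which group receives label 0 may differ between runs with different seeds; only the grouping itself is meaningful. Applying the same step recursively to the resulting clusters would yield a divisive hierarchy in the spirit of PDDP.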

Full Text