Abstract

Kernel K-means is a powerful unsupervised machine learning tool for cluster discovery. This paper explores its application to network segmentation, a type of cluster discovery. Networks are very often extremely large, which makes computational complexity a major issue in algorithm development. In this regard, the sparse structure of kernel matrices can be an invaluable asset. Partitioning nonvectorial data such as network graphs requires a vector-free clustering criterion for K-means. The kernel matrix constructed from nonvectorial data usually needs to be preprocessed to ensure Mercer's condition, which is required to guarantee monotonic convergence of the kernel K-means algorithm. This paper describes a centroid-free criterion for K-means so that it can be applied to nonvectorial data, as in network segmentation. This criterion leads to the notion of pattern-centroid similarity, and ultimately to a kernel-trick algorithm based on updating the pattern-centroid similarity. Furthermore, by adopting a recursive updating scheme, the recursive kernel trick reduces the computation from O(N^2/K) to O(N). For highly sparse networks, the computation can be reduced further from O(N) to O(N_s), where N_s is the average number of nonzero elements per column of the kernel matrix. Copyright © 2011 John Wiley & Sons, Ltd.
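The idea of updating a pattern-centroid similarity recursively can be illustrated with a minimal sketch. This is not the paper's algorithm, only one plausible reading of it under stated assumptions: the kernel matrix `K` is dense, symmetric, and satisfies Mercer's condition; the function name `kernel_kmeans_recursive` and the arrays `S` (per-cluster similarity sums) and `Q` (within-cluster kernel sums) are hypothetical names introduced here. The squared feature-space distance from pattern t to cluster k is written without any explicit centroid vector as K[t,t] - 2 S[t,k]/n_k + Q[k]/n_k^2, and moving a single pattern touches only one column of `K`.

```python
import numpy as np

def kernel_kmeans_recursive(K, init_labels, n_clusters, max_sweeps=50):
    """Sequential kernel K-means driven by a centroid-free criterion.

    Assumptions (not taken from the paper): K is a dense symmetric
    positive semidefinite kernel matrix; init_labels holds integers
    in [0, n_clusters).
    """
    K = np.asarray(K, dtype=float)
    labels = np.array(init_labels, dtype=int)
    N = K.shape[0]
    diag = np.diag(K).copy()

    # One-hot membership matrix, used only for initialisation.
    member = np.zeros((N, n_clusters))
    member[np.arange(N), labels] = 1.0
    counts = member.sum(axis=0)       # cluster sizes n_k
    S = K @ member                    # S[i, k] = sum of K[i, j] over j in cluster k
    Q = (member * S).sum(axis=0)      # Q[k] = sum of K[i, j] over i, j in cluster k

    for _ in range(max_sweeps):
        moved = False
        for t in range(N):
            a = labels[t]
            if counts[a] <= 1:
                continue              # never empty a cluster
            safe = np.maximum(counts, 1.0)
            # Squared distance to every (implicit) centroid.
            dist = diag[t] - 2.0 * S[t] / safe + Q / safe**2
            dist[counts == 0] = np.inf
            b = int(np.argmin(dist))
            if b == a or dist[b] >= dist[a]:
                continue
            # Recursive update: use the old S values, then shift column t.
            Q[a] += diag[t] - 2.0 * S[t, a]
            Q[b] += diag[t] + 2.0 * S[t, b]
            S[:, a] -= K[:, t]        # O(N) work for a dense column
            S[:, b] += K[:, t]
            counts[a] -= 1
            counts[b] += 1
            labels[t] = b
            moved = True
        if not moved:
            break
    return labels

# Toy usage: linear kernel on two well-separated 1-D groups.
x = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
K_toy = np.outer(x, x)
final = kernel_kmeans_recursive(K_toy, [0, 1, 0, 1, 0, 1], n_clusters=2)
```

In this sketch each move costs one column of `K`. If `K` were stored in a sparse column format, the two column updates would touch only the nonzero entries of that column, which is where a cost proportional to N_s rather than N would presumably come from.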
