Abstract

In the family of clustering problems we are given a set of objects (vertices of the graph), together with some observed pairwise similarities (edges). The goal is to identify clusters of similar objects by slightly modifying the graph to obtain a cluster graph (disjoint union of cliques). Hüffner et al. (Theory Comput. Syst. 47(1), 196-217, 2010) initiated the parameterized study of Cluster Vertex Deletion, where the allowed modification is vertex deletion, and presented an elegant $\mathcal{O}\left(\min\left(2^{k} k^{6} \log k + n^{3},\ 2^{k} k m \sqrt{n} \log n\right)\right)$-time fixed-parameter algorithm, parameterized by the solution size. In the last 5 years, this algorithm remained the fastest known algorithm for Cluster Vertex Deletion and, thanks to its simplicity, became one of the textbook examples of an application of the iterative compression principle. In our work we break the $2^{k}$-barrier for Cluster Vertex Deletion and present an $\mathcal{O}(1.9102^{k} (n+m))$-time branching algorithm. We achieve this improvement by a number of structural observations which we incorporate into the algorithm's branching steps.
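To make the problem statement concrete, the following is a minimal Python sketch of the folklore baseline, not the algorithm of this paper. It relies on the standard fact that a graph is a cluster graph exactly when it has no induced path on three vertices, and it branches on deleting one of the three vertices of such a path, which gives a $3^{k}$-type search tree. The function names and the adjacency-dictionary representation are assumptions made for illustration.

```python
from itertools import combinations


def find_induced_p3(adj):
    """Return (u, v, w) with edges u-v and v-w but no edge u-w, or None.

    adj maps each vertex to the set of its neighbours (undirected graph).
    A graph is a disjoint union of cliques iff no such triple exists.
    """
    for v, nbrs in adj.items():
        for u, w in combinations(sorted(nbrs), 2):
            if w not in adj[u]:
                return (u, v, w)
    return None


def delete_vertex(adj, x):
    """Return a copy of the graph with vertex x removed."""
    return {a: nbrs - {x} for a, nbrs in adj.items() if a != x}


def cluster_vd(adj, k):
    """Return a set of at most k vertices whose deletion leaves a
    cluster graph, or None if no such set exists.

    Naive branching: any solution must delete at least one vertex of
    every induced path on three vertices, so try all three choices.
    """
    p3 = find_induced_p3(adj)
    if p3 is None:
        return set()
    if k == 0:
        return None
    for x in p3:
        sol = cluster_vd(delete_vertex(adj, x), k - 1)
        if sol is not None:
            return sol | {x}
    return None


# Illustrative input: the path 0-1-2-3-4.
path5 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
solution = cluster_vd(path5, 1)
```

On the five-vertex path, deleting the middle vertex leaves two disjoint edges, so a solution of size one exists, while no solution of size zero does.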

Highlights

  • The problem of clustering objects based on their pairwise similarities has arisen from applications both in computational biology [5] and machine learning [4]

  • We are working in the setting where we look for a minimum solution to CLUSTERVD on (G, k) not containing v and, by Corollary 5, containing a vertex cover of $H_v$

  • We have presented a new branching algorithm for CLUSTER VERTEX DELETION

Summary

Introduction

The problem of clustering objects based on their pairwise similarities has arisen from applications both in computational biology [5] and machine learning [4]. Theorem 1. CLUSTER VERTEX DELETION can be solved in $\mathcal{O}(1.9102^{k} (n + m))$ time and polynomial space on an input (G, k) with |V(G)| = n and |E(G)| = m. Such a representation leads to a more time- and space-efficient solution for very dense input graphs. This would be a better choice if one expects most of the vertices of the resulting cluster graph to form a single clique. The Appendix contains a Python script for computing the worst-case complexity of the algorithm.
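The base of the exponent in a bound such as $\mathcal{O}(1.9102^{k} (n+m))$ is typically obtained from the branching vectors of the algorithm. The sketch below is not the script from the paper's Appendix. It only illustrates the standard calculation, under the assumption that a branching rule reduces the parameter by $a_1, \dots, a_r$ in its $r$ branches, so that its branching number is the unique root $x > 1$ of $\sum_i x^{-a_i} = 1$. The function name and the example vectors are hypothetical and are not taken from the paper.

```python
def branching_number(vector, tol=1e-12):
    """Return the root x > 1 of sum(x ** -a for a in vector) == 1.

    vector lists the parameter decrease in each branch; it needs at
    least two positive entries for a root greater than 1 to exist.
    """
    if len(vector) < 2 or any(a <= 0 for a in vector):
        raise ValueError("need at least two positive decreases")

    def f(x):
        # Strictly decreasing in x for x > 0; positive at x = 1.
        return sum(x ** (-a) for a in vector) - 1.0

    lo, hi = 1.0, 2.0
    while f(hi) > 0:  # grow the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol:  # plain bisection on the bracket [lo, hi]
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return hi


# The worst branching number over all rules bounds the search tree size.
illustrative_rules = [(1, 1), (1, 2), (2, 2, 3)]  # not the paper's vectors
worst = max(branching_number(v) for v in illustrative_rules)
```

For example, the vector (1, 1) has branching number 2, which corresponds to a $2^{k}$ search tree, and (1, 2) gives the golden ratio, roughly 1.618. An algorithm whose rules all have branching number at most $c$ explores $\mathcal{O}(c^{k})$ nodes.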

Preliminaries
Basic Properties
Special Cases of $H_v$
Algorithm
Preprocessing
Accessing $H_v$ in Linear Time
Subroutine
Main Algorithm
Complexity Analysis
Summary
Co-cluster Setting
Conclusions and Open Problems