Co-Clustering under the Maximum Norm

Laurent Bulteau,Vincent Froese,Sepp Hartung,Rolf Niedermeier

doi:10.3390/a9010017

Abstract

Co-clustering, that is, partitioning a numerical matrix into “homogeneous” submatrices, has many applications ranging from bioinformatics to election analysis. Many interesting variants of co-clustering are NP-hard. We focus on the basic variant of co-clustering where the homogeneity of a submatrix is defined in terms of minimizing the maximum distance between two entries. In this context, we identify several NP-hard special cases as well as a number of relevant polynomial-time solvable ones, thus charting the border of tractability for this challenging data clustering problem. For instance, we show polynomial-time solvability when the rows and columns must each be partitioned into two subsets (meaning that one obtains four submatrices). When partitioning rows and columns into three subsets each, however, we encounter NP-hardness, even for input matrices containing only values from {0, 1, 2}.

Highlights

Co-clustering, also known as bi-clustering [1], performs a simultaneous clustering of the rows and columns of a data matrix
A parameterized problem, where each instance consists of the “classical” problem instance I and an integer ρ called the parameter, is fixed-parameter tractable (FPT) if there is a computable function f and an algorithm solving any instance in f(ρ) · |I|^{O(1)} time
We observed that Co-Clustering_∞ is easy to solve for binary input matrices (Observation 1)
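
One way to see why the binary case is easy is sketched below. This is an illustration under stated assumptions, not necessarily the argument behind Observation 1: it assumes the cost of a co-clustering is the largest difference between two entries of the same cluster, and that the rows must be split into exactly k non-empty blocks and the columns into exactly l non-empty blocks. Under those assumptions the optimum for a 0/1 matrix is either 0 or 1, and cost 0 is reachable exactly when the matrix has at most k distinct rows and at most l distinct columns (rows sharing a block must be identical, and identical rows can always be grouped). The function name `binary_optimum` is hypothetical.

```python
def binary_optimum(A, k, l):
    """Optimal max-norm cost for a 0/1 matrix A (assumed cost model, see lead-in).

    A is a non-empty list of equal-length rows with entries in {0, 1};
    k and l are the required numbers of non-empty row and column blocks.
    """
    m, n = len(A), len(A[0])
    if not (1 <= k <= m and 1 <= l <= n):
        raise ValueError("need 1 <= k <= #rows and 1 <= l <= #columns")

    # Rows (columns) placed in the same block of a cost-0 solution must be identical.
    distinct_rows = len({tuple(row) for row in A})
    distinct_cols = len({tuple(col) for col in zip(*A)})

    if distinct_rows <= k and distinct_cols <= l:
        return 0  # group identical rows/columns, splitting groups if blocks are left over
    return 1      # entries are binary, so no cluster can have a spread above 1


A = [
    [0, 1, 1],
    [0, 1, 1],
    [1, 0, 0],
]
print(binary_optimum(A, 2, 2))
print(binary_optimum(A, 1, 2))
```

Both checks only compare whole rows and whole columns, so the sketch runs in time polynomial in the matrix size.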

Summary

Introduction

Co-clustering, also known as bi-clustering [1], performs a simultaneous clustering of the rows and columns of a data matrix. The problem is, given a numerical input matrix A, to partition the rows and columns of A into subsets minimizing a given cost function (measuring “homogeneity”). For a given subset I of rows and a subset J of columns, the corresponding cluster consists of all entries a_ij with i ∈ I and j ∈ J. The cost function usually defines homogeneity in terms of distances (measured in some norm) between the entries of each cluster. Note that the variant where clusters are allowed to “overlap”, meaning that some rows and columns are contained in multiple clusters, has also been studied [1]. We focus on the non-overlapping variant, which can be stated as follows:

Co-Clustering_L
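
To make the notion of a cluster concrete, the following minimal sketch evaluates a given co-clustering under the maximum norm. The formal cost function is not reproduced in this summary, so the sketch assumes, following the abstract, that the cost is the largest distance between two entries of the same cluster, maximized over all clusters. The function name `max_norm_cost` and the example matrix are hypothetical.

```python
def max_norm_cost(A, row_blocks, col_blocks):
    """Largest (max - min) spread over all clusters I x J (assumed cost model).

    A is a list of rows; row_blocks and col_blocks are lists of index lists
    that partition the row indices and column indices of A, respectively.
    """
    worst = 0
    for I in row_blocks:
        for J in col_blocks:
            # The cluster for (I, J) consists of all entries A[i][j] with i in I, j in J.
            entries = [A[i][j] for i in I for j in J]
            if entries:
                worst = max(worst, max(entries) - min(entries))
    return worst


A = [
    [0, 1, 5],
    [1, 0, 6],
    [9, 8, 2],
]
row_blocks = [[0, 1], [2]]   # two row subsets
col_blocks = [[0, 1], [2]]   # two column subsets, giving four clusters
print(max_norm_cost(A, row_blocks, col_blocks))
```

Evaluating a fixed co-clustering is cheap; the difficulty of the problem lies in searching over the row and column partitions.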

Related Work

Our Contributions

Formal Definitions and Preliminaries

Problem Definition

Parameterized Algorithmics

Intractability Results

Constant Number of Clusters

Constant Number of Rows

Clustering into Consecutive Clusters

Tractability Results

Reduction to CNF-SAT Solving

Polynomial-Time Solvability

Fixed-Parameter Tractability

Conclusions


Journal: Algorithms	Publication Date: Feb 25, 2016
Citations: 3	License type: CC BY 4.0
