Convex Clustering via ℓ1 Fusion Penalization

Peter Radchenko, Gourab Mukherjee

doi:10.1111/rssb.12226

Abstract

We study the large sample behaviour of a convex clustering framework, which minimizes the sample within-cluster sum of squares under an ℓ1 fusion constraint on the cluster centroids. This recently proposed approach has been gaining in popularity; however, its asymptotic properties have remained mostly unknown. Our analysis is based on a novel representation of the sample clustering procedure as a sequence of cluster splits determined by a sequence of maximization problems. We use this representation to provide a simple and intuitive formulation for the population clustering procedure. We then demonstrate that the sample procedure consistently estimates its population analogue and we derive the corresponding rates of convergence. The proof conducts a careful simultaneous analysis of a collection of M-estimation problems, whose cardinality grows together with the sample size. On the basis of the new perspectives gained from the asymptotic investigation, we propose a key post-processing modification of the original clustering framework. We show, both theoretically and empirically, that the resulting approach can be successfully used to estimate the number of clusters in the population. Using simulated data, we compare the proposed method with existing number-of-clusters and modality assessment approaches and obtain encouraging results. We also demonstrate the applicability of our clustering method to the detection of cellular subpopulations in a single-cell virology study.
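
To make the criterion in the abstract concrete, the sketch below evaluates a penalized form of it: a within-cluster sum-of-squares fit term plus an ℓ1 fusion penalty on all pairs of centroids. The exact scaling (the factor 0.5, the tuning parameter `lam`, uniform pair weights) and the function name `fusion_objective` are assumptions made for illustration; they are not taken from the paper, and the code only evaluates the objective rather than minimizing it.

```python
from itertools import combinations


def fusion_objective(points, centroids, lam):
    """Evaluate an assumed penalized convex clustering criterion.

    points, centroids: equal-length sequences of equal-dimension tuples,
        where centroids[i] is the centroid assigned to points[i].
    lam: non-negative weight on the fusion penalty (hypothetical name).
    """
    # Fit term: half the squared Euclidean distance from each point
    # to its own centroid, summed over all points.
    fit = 0.5 * sum(
        (x - a) ** 2
        for p, c in zip(points, centroids)
        for x, a in zip(p, c)
    )
    # Fusion term: l1 distance between every pair of centroids.
    # It is zero for a pair exactly when the two centroids coincide,
    # i.e. when the two points are placed in the same cluster.
    penalty = sum(
        abs(u - v)
        for ci, cj in combinations(centroids, 2)
        for u, v in zip(ci, cj)
    )
    return fit + lam * penalty


# Three univariate observations: two close together, one far away.
points = [(0.0,), (1.0,), (10.0,)]
unfused = fusion_objective(points, points, lam=0.5)
partly_fused = fusion_objective(points, [(0.5,), (0.5,), (10.0,)], lam=0.5)
```

In this toy example, fusing the two nearby observations onto a shared centroid gives a lower objective value than leaving every observation as its own centroid, which is the trade-off between fit and fusion that the penalty is meant to control.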

Journal: Journal of the Royal Statistical Society Series B: Statistical Methodology
Publication Date: Feb 13, 2017
Citations: 68
License type: publisher-specific, author manuscript

Similar Papers

Asymptotic properties of bivariate k-means clusters
M Anthony Wong
Communications in Statistics - Theory and Methods | VOL. 11
01 Jan 1981

Constrained Minimum Sum of Squares Clustering by Constraint Programming
Christel Vrain ... Khanh-Chuong Duong
01 Jan 2015

Clustering with Noising Method
Yan Liu ... Kefei Chen
01 Jan 2004

Technical Note: Using k-means clustering to determine the number and position of isocenters in MLC-based multiple target intracranial radiosurgery.
Adam D Yock ... Gwe‐Ya Kim
Journal of applied clinical medical physics | VOL. 18
20 Jul 2017
