Convex clustering: an attractive alternative to hierarchical clustering.

Gary K Chen, John Michael O Ranola, Eric C Chi, Kenneth Lange

doi:10.1371/journal.pcbi.1004228

Abstract

The primary goal in cluster analysis is to discover natural groupings of objects. The field of cluster analysis is crowded with diverse methods that make special assumptions about data and address different scientific aims. Despite its shortcomings in accuracy, hierarchical clustering is the dominant clustering method in bioinformatics. Biologists find the trees constructed by hierarchical clustering visually appealing and in tune with their evolutionary perspective. Hierarchical clustering operates on multiple scales simultaneously. This is essential, for instance, in transcriptome data, where one may be interested in making qualitative inferences about how lower-order relationships like gene modules lead to higher-order relationships like pathways or biological processes. The recently developed method of convex clustering preserves the visual appeal of hierarchical clustering while ameliorating its propensity to make false inferences in the presence of outliers and noise. The solution paths generated by convex clustering reveal relationships between clusters that are hidden by static methods such as k-means clustering. The current paper derives and tests a novel proximal distance algorithm for minimizing the objective function of convex clustering. The algorithm separates parameters, accommodates missing data, and supports prior information on relationships. Our program CONVEXCLUSTER incorporating the algorithm is implemented on ATI and nVidia graphics processing units (GPUs) for maximal speed. Several biological examples illustrate the strengths of convex clustering and the ability of the proximal distance algorithm to handle high-dimensional problems. CONVEXCLUSTER can be freely downloaded from the UCLA Human Genetics web site at http://www.genetics.ucla.edu/software/

Highlights

Pattern discovery is one of the primary goals of bioinformatics
All points eventually coalesce to a single cluster when k exceeds a particular threshold, which is determined by the separation of the nodes
In convex clustering one can often achieve a predetermined number of clusters by varying the number of nearest neighbors and following the solution path to its final destination
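
The highlights above refer to an objective function, nearest-neighbor weights, and a solution path along which clusters merge. The sketch below illustrates these ideas under stated assumptions. It uses the criterion commonly given for convex clustering in the literature, 0.5 * sum_i ||x_i - u_i||^2 + mu * sum_{i<j} w_ij * ||u_i - u_j||, which is not quoted in this excerpt. The symbols mu, u_i, and w_ij, the 0/1 k-nearest-neighbor weights, and all function names are illustrative assumptions. The closed form covers only two points in one dimension; it is not the paper's proximal distance algorithm and is unrelated to the CONVEXCLUSTER program.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_weights(X, k):
    """Symmetric 0/1 weights: w_ij = 1 if j is among the k nearest
    neighbors of i, or i is among those of j (an assumed, simplified scheme)."""
    n = len(X)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: dist(X[i], X[j]))
        for j in others[:k]:
            W[i][j] = W[j][i] = 1.0
    return W


def objective(X, U, W, mu):
    """Assumed convex clustering criterion: a squared-error fit term
    plus mu times the weighted sum of pairwise center distances."""
    n = len(X)
    fit = 0.5 * sum(dist(x, u) ** 2 for x, u in zip(X, U))
    penalty = sum(W[i][j] * dist(U[i], U[j])
                  for i in range(n) for j in range(i + 1, n))
    return fit + mu * penalty


def two_point_centers(x1, x2, mu, w=1.0):
    """Exact minimizer for two scalar points: each center moves toward
    the other by mu * w, and the two merge at the mean once
    2 * mu * w >= |x2 - x1|."""
    gap = abs(x2 - x1)
    if 2.0 * mu * w >= gap:
        mean = 0.5 * (x1 + x2)
        return mean, mean
    shift = mu * w if x2 > x1 else -mu * w
    return x1 + shift, x2 - shift


# Two points at 0 and 4: the centers approach each other as mu grows
# and coalesce once mu reaches half the separation.
print(two_point_centers(0.0, 4.0, 1.0))  # (1.0, 3.0)
print(two_point_centers(0.0, 4.0, 2.0))  # (2.0, 2.0)
```

In this toy setting the merge threshold depends only on the separation of the two points and their weight, which is the simplest instance of a solution path. With more points, the choice of k in `knn_weights` decides which pairs are penalized at all, and a pair with zero weight is never pulled together directly.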

Summary

Introduction

Pattern discovery is one of the primary goals of bioinformatics. Cluster analysis is a broad term for a variety of exploratory methods that reveal patterns based on similarities between data points. Well-known methods such as k-means invoke a fixed number of clusters. Often the number of clusters is unknown in advance, and it is appealing to vary the number of clusters simultaneously with cluster assignment. In addition to producing results that are easy to visualize and interpret, hierarchical clustering is simple to implement and computationally quick. These are legitimate advantages, but they do not compensate for hierarchical clustering’s instability under small data perturbations such as measurement error. Cluster inference can be adversely affected as small errors accumulate.

Journal: PLOS Computational Biology
Publication Date: May 12, 2015
Citations: 71
License type: CC BY 4.0

Similar Papers

A dual reformulation and solution framework for regularized convex clustering problems
J Pi ... Panos M Pardalos
European Journal of Operational Research | VOL. 290
15 Sep 2020

Portability for GPU-accelerated molecular docking applications for cloud and HPC: can portable compiler directives provide performance across all platforms?
Mathialakan Thavappiragasam ... Wael Elwasif
01 May 2022

Robust Convex Clustering Analysis
01 Dec 2016

A Review of Convex Clustering From Multiple Perspectives: Models, Optimizations, Statistical Properties, Applications, and Connections.
Qiying Feng ... Licheng Liu
IEEE Transactions on Neural Networks and Learning Systems | VOL. 35
01 Oct 2024
