Hierarchical information clustering by means of topologically embedded graphs.

Won-Min Song, Tomaso Aste, T Di Matteo

doi:10.1371/journal.pone.0031929


Open Access

https://doi.org/10.1371/journal.pone.0031929


Journal: PLoS ONE	Publication Date: Mar 9, 2012
Citations: 156	License type: CC BY 4.0

Affiliation: Australian National University, University of Kent

Abstract

We introduce a graph-theoretic approach to extract clusters and hierarchies in complex data-sets in an unsupervised and deterministic manner, without the use of any prior information. This is achieved by building topologically embedded networks containing the subset of most significant links and analyzing the network structure. For a planar embedding, this method provides both the intra-cluster hierarchy, which describes the way clusters are composed, and the inter-cluster hierarchy, which describes how clusters gather together. We discuss the performance, robustness and reliability of this method by first investigating several artificial data-sets, finding that it can significantly outperform other established approaches. Then we show that our method can successfully differentiate meaningful clusters and hierarchies in a variety of real data-sets. In particular, we find that the application to gene expression patterns of lymphoma samples uncovers biologically significant groups of genes which play key roles in the diagnosis, prognosis and treatment of some of the most relevant human lymphoid malignancies.

Highlights

Filtering information out of complex datasets is becoming a central issue and a crucial bottleneck in any scientific endeavor
We apply the DBHT technique to various data sets ranging from artificial data with known clustering and hierarchical structures to real gene expression data
Comparisons are made between the results retrieved by the DBHT technique and those of some state-of-the-art cluster analysis techniques such as k-means++ [29], Spectral clustering via Normalized cut on k-nearest neighbor graph [30,31], Self Organizing Map (SOM) [32] and Q-cut [33]
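
When a data set has a known clustering, a retrieved partition can be scored against it with a pair-counting agreement measure. The sketch below computes the adjusted Rand index in plain Python. The choice of this particular index is an assumption made for illustration, since the text above does not say which score the comparisons use, and the function name `adjusted_rand_index` and the toy label lists are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Chance-corrected agreement between two partitions of the same items.

    Assumption: this index is used here only as an illustrative score;
    the source does not state which measure its comparisons rely on.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same items")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: the partitions trivially agree

    # Contingency counts: items sharing a (true cluster, predicted cluster) pair.
    cell_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of item pairs placed together in both / in each partition.
    together_both = sum(comb(c, 2) for c in cell_counts.values())
    together_true = sum(comb(c, 2) for c in true_counts.values())
    together_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = together_true * together_pred / total_pairs
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        # Happens only when both partitions are a single cluster
        # or both are all singletons, i.e. they are identical.
        return 1.0
    return (together_both - expected) / (maximum - expected)


if __name__ == "__main__":
    known = [0, 0, 0, 1, 1, 1]      # hypothetical ground-truth clusters
    retrieved = [1, 1, 1, 0, 0, 2]  # hypothetical output of some method
    print(adjusted_rand_index(known, retrieved))
```

The score does not depend on how clusters are named, so methods that number their clusters differently can be compared on the same footing. Identical partitions score 1, and values near 0 indicate agreement no better than chance.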

Summary

Introduction

Filtering information out of complex datasets is becoming a central issue and a crucial bottleneck in any scientific endeavor. Requiring prior information is a potential problem because filtering is often one of the preliminary processing steps applied to the data, and it is performed at a stage where very little information about the system is available. Another difficulty may arise from the fact that, in some cases, the reduction of the system into a set of separated local communities may hide properties associated with the global organization. In the literature there exist several methods which can be used to extract clusters and hierarchies [1,2,3], and the application to biology and gene expression data has attracted great attention in recent years [4,5,6,7]. In these established approaches, to extract discrete clusters, one must input some a priori information about their number or define a thresholding value. We propose an alternative method that overcomes these limitations, providing both clustering subdivision and hierarchical organization without the need for any prior information, without demanding supervision and without requiring thresholding.

