Divisive hierarchical maximum likelihood clustering

Alok Sharma, Yosvany López, Tatsuhiko Tsunoda

doi:10.1186/s12859-017-1965-5

Abstract

Background: Biological data comprises various topologies or a mixture of forms, which makes its analysis extremely complicated. With this data increasing on a daily basis, the design and development of efficient and accurate statistical methods has become absolutely necessary. Specific analyses, such as those related to genome-wide association studies and multi-omics information, are often aimed at clustering sub-conditions of cancers and other diseases. Hierarchical clustering methods, which can be categorized into agglomerative and divisive, have been widely used in such situations. However, unlike agglomerative methods, divisive clustering approaches have consistently proved to be computationally expensive.

Results: The proposed clustering algorithm (DRAGON) was verified on mutation and microarray data, and was gauged against standard clustering methods in the literature. Its validation included synthetic and significant biological data. When validated on mixed-lineage leukemia data, DRAGON achieved the highest clustering accuracy with data of four different dimensions. Consequently, DRAGON outperformed previous methods with 3-, 4- and 5-dimensional acute leukemia data. When tested on mutation data, DRAGON achieved the best performance with 2-dimensional information.

Conclusions: This work proposes a computationally efficient divisive hierarchical clustering method, which can compete equally with agglomerative approaches. The proposed method turned out to correctly cluster data with distinct topologies. A MATLAB implementation can be extracted from http://www.riken.jp/en/research/labs/ims/med_sci_math/ or http://www.alok-ai-lab.com

Highlights

Biological data comprises various topologies or a mixture of forms, which makes its analysis extremely complicated
Divisive procedures, which start with the entire dataset, are in general considered safer than agglomerative approaches [21, 23]
The divisive procedure has not been generally used for hierarchical clustering, remaining largely ignored in the literature

Summary

Introduction

Biological data comprises various topologies or a mixture of forms, which makes its analysis extremely complicated. With this data increasing on a daily basis, the design and development of efficient and accurate statistical methods has become absolutely necessary. Divisive methods, on the other hand, perform clustering in the inverse way compared to their agglomerative counterparts. They begin by considering a single group (containing all the samples) and divide it into two groups at each stage until every group comprises only a single sample [21, 22]. False decisions made in early stages cannot be corrected later on. For this reason, divisive procedures, which start with the entire dataset, are in general considered safer than agglomerative approaches [21, 23]. The divisive procedure has not been generally used for hierarchical clustering, remaining largely ignored in the literature.
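The top-down procedure described above can be illustrated with a minimal Python sketch. It is not the DRAGON algorithm: the splitting rule used here (seeding the two new groups with the two farthest-apart samples) is an assumption chosen only to keep the example short, and the function names `split_in_two` and `divisive_clustering` are hypothetical. The sketch shows only the generic divisive loop: start with one group holding all samples and keep dividing groups in two until only single samples remain.

```python
from itertools import combinations
from math import dist


def split_in_two(points):
    """Split a group into two non-empty groups.

    Assumption for illustration: the two farthest-apart samples act as
    seeds, and every other sample joins the nearer seed. This is NOT the
    maximum likelihood criterion of the paper.
    """
    i, j = max(
        combinations(range(len(points)), 2),
        key=lambda ij: dist(points[ij[0]], points[ij[1]]),
    )
    left, right = [points[i]], [points[j]]
    for k, p in enumerate(points):
        if k in (i, j):
            continue  # the seeds are already placed
        if dist(p, points[i]) <= dist(p, points[j]):
            left.append(p)
        else:
            right.append(p)
    return left, right


def divisive_clustering(points):
    """Return the sequence of binary splits, from the top of the tree down."""
    splits = []
    pending = [list(points)]  # one group containing all the samples
    while pending:
        group = pending.pop(0)
        if len(group) < 2:
            continue  # a single sample cannot be divided further
        left, right = split_in_two(group)
        splits.append((left, right))
        pending.extend([left, right])
    return splits


if __name__ == "__main__":
    samples = [(0, 0), (0, 1), (10, 0), (10, 1)]
    for left, right in divisive_clustering(samples):
        print(left, "|", right)
```

Because every split produces two non-empty groups, a dataset of n samples is fully separated after n - 1 splits in this sketch. The exhaustive search for the farthest pair is quadratic in the group size, so the heuristic is suitable only for small illustrative inputs.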

Journal: BMC Bioinformatics	Publication Date: Dec 1, 2017
Citations: 24	License type: open-access
