Clustering on hierarchical heterogeneous data with prior pairwise relationships

Wei Han, Sanguo Zhang, Hailong Gao, Deliang Bu

doi:10.1186/s12859-024-05652-6

Abstract

Background: Clustering is a fundamental problem in statistics with broad applications in many areas. Traditional clustering methods treat all features equally and ignore the potential structure that arises from differences among feature types. In cancer diagnosis and treatment in particular, several types of biological features are collected and analyzed together. Treating these features equally fails to identify the heterogeneity of both the data structure and the cancer itself, which leads to incomplete and ineffective anti-cancer therapies.

Objectives: In this paper, we propose a clustering framework for hierarchical heterogeneous data with prior pairwise relationships. The proposed method fully characterizes the differences among features and identifies potential hierarchical structure through rough and refined clusters.

Results: The refined clustering further divides the clusters obtained by the rough clustering into subtypes. It thus provides deeper insight into cancer that existing clustering methods cannot detect. The proposed method is also flexible with respect to prior information: additional pairwise relationships between samples can be incorporated to improve clustering performance. Finally, statistical consistency properties of the proposed method are rigorously established, including accurate estimation of parameters and correct determination of clustering structures.

Conclusions: The proposed method achieves better clustering performance than other methods in simulation studies, and clustering accuracy increases when prior information is incorporated. Meaningful biological findings are obtained in an analysis of lung adenocarcinoma using clinical imaging data and omics data, showing that the hierarchical structure produced by rough and refined clustering is necessary and reasonable.
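The idea of rough clusters refined into subtypes, with prior pairwise relationships between samples, can be illustrated with a toy sketch. This is not the authors' estimator, which the abstract does not specify. It substitutes plain one-dimensional k-means for the clustering step and treats each prior pair as a must-link constraint, which is one possible reading of "pairwise relationships". All function names (`kmeans_1d`, `must_link_groups`, `cluster_with_prior`, `rough_then_refined`) and all data values are hypothetical.

```python
from statistics import mean

def kmeans_1d(values, k, iters=20):
    """Plain 1-D k-means (a stand-in, not the paper's method). Assumes k >= 2."""
    srt = sorted(values)
    # Deterministic init: evenly spaced order statistics of the data.
    centers = [srt[i * (len(srt) - 1) // (k - 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = mean(members)
    return labels

def must_link_groups(n, pairs):
    """Union-find: samples joined by a prior pair get the same root id."""
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for a, b in pairs:
        parent[find(a)] = find(b)
    return [find(i) for i in range(n)]

def cluster_with_prior(values, k, pairs=()):
    """Cluster group means so that must-linked samples share one label."""
    roots = must_link_groups(len(values), pairs)
    groups = sorted(set(roots))
    group_means = [mean([v for v, r in zip(values, roots) if r == g]) for g in groups]
    lookup = dict(zip(groups, kmeans_1d(group_means, k)))
    return [lookup[r] for r in roots]

def rough_then_refined(x, z, k_rough=2, k_refined=2, pairs=()):
    """Stage 1: rough clusters from feature x. Stage 2: subtypes from feature z."""
    rough = cluster_with_prior(x, k_rough, pairs)
    refined = [None] * len(x)
    for c in sorted(set(rough)):
        idx = [i for i, r in enumerate(rough) if r == c]
        local = {i: j for j, i in enumerate(idx)}
        # Only pairs that fall inside this rough cluster constrain its subtypes.
        local_pairs = [(local[a], local[b]) for a, b in pairs
                       if a in local and b in local]
        sub = cluster_with_prior([z[i] for i in idx], k_refined, local_pairs)
        for i, s in zip(idx, sub):
            refined[i] = (c, s)
    return rough, refined

# Made-up data: x plays the role of one feature type, z of another.
x = [0.1, 0.2, 0.15, 0.25, 5.0, 5.1, 4.9, 5.2]
z = [1.0, 1.1, 3.0, 3.1, 1.0, 1.2, 3.0, 2.9]
rough, refined = rough_then_refined(x, z, pairs=[(0, 1)])
```

Here `rough` separates the samples into two groups using `x` alone, and `refined` labels each sample with a (rough cluster, subtype) pair, so every subtype is nested inside exactly one rough cluster. Samples 0 and 1 are tied together by the prior pair and therefore always receive the same label at both levels.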

Journal: BMC Bioinformatics
Publication Date: Jan 23, 2024
License type: CC BY 4.0

