Graph-Based Dissimilarity Measurement for Cluster Analysis of Any-Type-Attributed Data.

Yiqun Zhang, Yiu-Ming Cheung

doi:10.1109/tnnls.2022.3202700

Abstract

Heterogeneous attribute data, composed of attributes with different types of values, are common in a variety of real-world applications. As data annotation is usually expensive, clustering provides a promising way to process unlabeled data, where the adopted similarity measure plays a key role in determining clustering accuracy. However, it is very challenging to appropriately define the similarity between data objects with heterogeneous attributes, because values from heterogeneous attributes generally have very different characteristics. Specifically, numerical attributes have quantitative values, while categorical attributes have qualitative values. Furthermore, categorical attributes can be divided into nominal and ordinal ones according to whether their values carry order information. To bridge the gap among the heterogeneous attributes, this article proposes a new dissimilarity metric for cluster analysis of such data. We first study the connections among the heterogeneous attributes and build graph representations for them. Then, a metric is proposed that computes the dissimilarities between attribute values under the guidance of the graph structures. Finally, we develop a new k-means-type clustering algorithm associated with the proposed metric. The proposed method can perform cluster analysis of datasets composed of an arbitrary combination of numerical, nominal, and ordinal attributes. Experimental results show its efficacy in comparison with its counterparts.
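
The abstract does not give the metric's formulas, so the following Python sketch is only an illustration of the general idea of measuring dissimilarity over per-attribute graphs, not the authors' method. It assumes an ordinal attribute can be modeled as a path graph (values linked in order), a nominal attribute as a complete graph (every pair of values linked), and a numerical attribute as a range-normalized difference. All function names and the example attributes are hypothetical.

```python
from collections import deque


def path_graph(values):
    """Assumed model for an ordinal attribute: consecutive values are linked."""
    return {v: [values[i + d] for d in (-1, 1) if 0 <= i + d < len(values)]
            for i, v in enumerate(values)}


def complete_graph(values):
    """Assumed model for a nominal attribute: every pair of values is linked."""
    return {v: [u for u in values if u != v] for v in values}


def hops(graph, src, dst):
    """Shortest-path length between two values, found by breadth-first search."""
    seen = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return seen[node]
        for nxt in graph[node]:
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    raise ValueError("values are not connected in the graph")


def value_dissimilarity(graph, a, b):
    """Hop count between a and b, scaled to [0, 1] by the graph diameter."""
    diameter = max(hops(graph, u, v) for u in graph for v in graph)
    return hops(graph, a, b) / diameter if diameter else 0.0


def object_dissimilarity(x, y, graphs, ranges):
    """Average per-attribute dissimilarity between two objects (dicts).

    Attributes listed in `graphs` are categorical; the rest are numerical
    and are scaled by the (min, max) pair given in `ranges`.
    """
    total = 0.0
    for name in x:
        if name in graphs:
            total += value_dissimilarity(graphs[name], x[name], y[name])
        else:
            lo, hi = ranges[name]
            total += abs(x[name] - y[name]) / (hi - lo)
    return total / len(x)


# Hypothetical example data: one ordinal, one nominal, one numerical attribute.
graphs = {
    "size": path_graph(["small", "medium", "large"]),
    "color": complete_graph(["red", "green", "blue"]),
}
ranges = {"age": (0, 100)}
a = {"size": "small", "color": "red", "age": 20}
b = {"size": "medium", "color": "red", "age": 70}
print(object_dissimilarity(a, b, graphs, ranges))
```

Under these assumptions, "small" and "large" are maximally dissimilar while "small" and "medium" are half as dissimilar, whereas any two distinct colors are equally dissimilar. A k-means-type algorithm, as mentioned in the abstract, would use such a function in place of the Euclidean distance when assigning objects to clusters.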

Journal: IEEE Transactions on Neural Networks and Learning Systems
Publication Date: Sep 1, 2023
Citations: 13
License type: CC BY 4.0

Similar Papers

Locality-Sensitive Hashing for Data with Categorical and Numerical Attributes Using Dual Hashing
Keon Myung Lee
International Journal of Fuzzy Logic and Intelligent Systems | VOL. 14
30 Jun 2014

PerRank: Personalized Rank Retrieval with Categorical and Numerical Attributes
Sangkyum Kim ... Younhee Ko
01 Jul 2008

A Unified Metric for Categorical and Numerical Attributes in Data Clustering
Yiu-Ming Cheung ... Hong Jia
01 Jan 2013

Clustering algorithm for mixed datasets using density peaks and Self-Organizing Generative Adversarial Networks
K Balaji ... A Geetha Mary
Chemometrics and Intelligent Laboratory Systems | VOL. 203
05 Jun 2020
