A Unified Metric for Categorical and Numerical Attributes in Data Clustering

Yiu-Ming Cheung,Hong Jia

doi:10.1007/978-3-642-37456-2_12

Publication Date: Jan 1, 2013

Citations: 4

Affiliation: Hong Kong Baptist University, United International College, Beijing Normal University

#Categorical Attributes #Numerical Attributes #Categorical Data #Numerical Data #Similarity Metrics #Iterative Clustering Algorithm #Mixed Attributes #Benchmark Data Sets #Data Clustering #Concept Of Similarity

Abstract

Most existing clustering approaches apply to purely numerical or purely categorical data, but not both. Clustering mixed data composed of numerical and categorical attributes is nontrivial because there is an awkward gap between the similarity metrics for categorical and numerical data. This paper therefore presents a general clustering framework based on the concept of object-cluster similarity and gives a unified similarity metric that can be applied directly to data with categorical, numerical, or mixed attributes. Accordingly, an iterative clustering algorithm is developed, and its efficacy is demonstrated experimentally on different benchmark data sets.
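The abstract does not state the metric itself, so the following is only a rough sketch of what an object-cluster similarity over mixed attributes might look like. Everything specific here is an assumption made for illustration and is not taken from the paper: the frequency-based categorical term, the exponential distance-based numerical term, the weighting that treats the numerical part as a single attribute, and all function and variable names.

```python
import math
from collections import Counter


def categorical_similarity(value, cluster_values):
    """Fraction of cluster members that share `value` on one categorical attribute."""
    if not cluster_values:
        return 0.0
    return Counter(cluster_values)[value] / len(cluster_values)


def numerical_similarities(x_num, centroids):
    """Normalised exp(-0.5 * distance) of a numeric vector to every centroid.

    The returned values sum to 1 across clusters (an assumed design choice).
    """
    weights = [math.exp(-0.5 * math.dist(x_num, c)) for c in centroids]
    total = sum(weights)
    return [w / total for w in weights]


def object_cluster_similarity(x_cat, x_num, clusters):
    """Similarity of one object to each cluster.

    `clusters` is a list of dicts with keys 'cat' and 'num', each holding the
    categorical and numerical parts of the cluster's (non-empty) member list.
    """
    d_cat = len(x_cat)
    d = d_cat + 1  # assumption: the numeric vector counts as one attribute
    centroids = [
        tuple(sum(col) / len(col) for col in zip(*c["num"])) for c in clusters
    ]
    num_sims = numerical_similarities(x_num, centroids)

    sims = []
    for cluster, s_num in zip(clusters, num_sims):
        per_attr = [
            categorical_similarity(v, [row[r] for row in cluster["cat"]])
            for r, v in enumerate(x_cat)
        ]
        s_cat = sum(per_attr) / d_cat if d_cat else 0.0
        sims.append((d_cat / d) * s_cat + (1 / d) * s_num)
    return sims


# Hypothetical toy data: two clusters, two categorical and one numeric attribute.
clusters = [
    {"cat": [("red", "s"), ("red", "m")], "num": [(1.0,), (1.2,)]},
    {"cat": [("blue", "l"), ("blue", "l")], "num": [(5.0,), (5.4,)]},
]
sims = object_cluster_similarity(("red", "s"), (1.1,), clusters)
best = max(range(len(sims)), key=sims.__getitem__)  # index of most similar cluster
```

Under these assumptions, an iterative algorithm of the kind the abstract mentions could repeatedly assign each object to its most similar cluster and then refresh the cluster statistics until assignments stop changing.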
