Abstract

Multiscale analysis lets people observe objects or problems from different perspectives, and multiscale clustering has been widely studied in various disciplines. However, most existing studies address only numerical datasets; clustering of nominal datasets has received little attention, especially when the data are not independent and identically distributed (Non-IID). To address this gap, this paper proposes a multiscale clustering framework for Non-IID nominal data. First, the benchmark-scale dataset is clustered using the coupled metric similarity (CMS) measure. Second, two algorithms are proposed to transform the clustering results from the benchmark scale to the target scale: upscaling based on single chain and downscaling based on the Lanczos kernel. Finally, experiments are performed on five public datasets and one real dataset from Hebei province, China. The results show that the method not only delivers competitive performance but also reduces computational cost.

Highlights

  • A Multiscale Clustering Approach for Nonindependent and Identically Distributed (Non-IID) Nominal Data. Received 29 August 2021; Revised 13 September 2021; Accepted 15 September 2021; Published 11 October 2021

  • Clustering is one of the vital data mining and machine learning techniques, which aims to group similar objects into the same cluster and separate dissimilar objects into different clusters [1]

  • In this paper, inspired by the idea of condensed hierarchical clustering, an upscaling algorithm based on coupled metric similarity (CMS), named UACMS, is proposed


Summary

A Multiscale Clustering Approach for Non-IID Nominal Data

Received 29 August 2021; Revised 13 September 2021; Accepted 15 September 2021; Published 11 October 2021. Multiscale clustering has been widely studied in various disciplines. Most existing studies address only numerical datasets; clustering of nominal datasets has received little attention, especially when the data are not independent and identically distributed (Non-IID). To address this gap, this paper proposes a multiscale clustering framework for Non-IID nominal data. The benchmark-scale dataset is clustered using the coupled metric similarity (CMS) measure. Two algorithms are proposed to transform the clustering results from the benchmark scale to the target scale: upscaling based on single chain and downscaling based on the Lanczos kernel. Experiments are performed on five public datasets and one real dataset from Hebei province, China. The results show that the method delivers competitive performance and reduces computational cost.
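
For readers unfamiliar with the Lanczos kernel named above, the following is a minimal sketch of its standard definition, L(x) = sinc(x) · sinc(x/a) for |x| < a and 0 otherwise. The function names `lanczos_kernel` and `lanczos_weights`, the default order `a = 3`, and the weight normalization are illustrative assumptions; this section does not specify how the paper applies the kernel to cluster results during downscaling.

```python
import math


def lanczos_kernel(x: float, a: int = 3) -> float:
    """Standard Lanczos window of order a (a = 3 is an assumed default)."""
    if x == 0.0:
        return 1.0  # limit of sinc(x) * sinc(x / a) as x -> 0
    if abs(x) >= a:
        return 0.0  # the kernel has finite support (-a, a)
    px = math.pi * x
    # sinc(x) * sinc(x / a) with the normalized sinc, sin(pi x) / (pi x)
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def lanczos_weights(offsets, a: int = 3):
    """Hypothetical helper: kernel values at the given offsets, scaled to sum to 1."""
    raw = [lanczos_kernel(d, a) for d in offsets]
    total = sum(raw)
    return [w / total for w in raw]


# Example: weights for four neighbours around a target position
weights = lanczos_weights([-1.5, -0.5, 0.5, 1.5])
```

The kernel weights nearby samples most heavily and can take small negative values further out, which is why the weights are normalized before use in this sketch.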

Introduction
Related Work
Preliminaries
Performance Evaluations
Findings
Conclusions
