A Three-step Method for Three-way Clustering by Similarity-based Sample’s Stability

Jin Zhu, Pingxin Wang, Dongqin Jiang, Jian Lin

doi:10.1155/2022/6555501


Open Access

https://doi.org/10.1155/2022/6555501


Abstract

Clustering is an important research field in machine learning. Traditional clustering approaches are not very effective in dealing with clusters that have overlapping regions. To better capture the three types of relationships between a cluster and a sample, namely, belong-to fully, belong-to partially and not belong-to fully, we propose a theory of similarity-based sample’s stability and develop a three-step method for three-way clustering by integrating similarity-based sample’s stability into the idea of three-way clustering. In the proposed theory, the similarity of two samples is used to define the frequencies of the two samples, and the sample’s stability is calculated from the defined frequencies and a determinacy function. With this stability, the universe is divided into a stable set and an unstable set. The samples in the stable set are assigned to the core region of each cluster by a traditional clustering algorithm. The samples in the unstable set are assigned to the fringe region of the corresponding cluster according to the distances between the elements and the centers of the cluster core regions. Therefore, a three-way clustering is naturally formed. Experimental results on data sets show that this method can improve the structure of the clustering results.

Highlights

Data clustering is one of the most fundamental topics for data exploration in machine learning and plays an important role in many fields such as information granulation, image analysis, network structure analysis and others [1–4]. The purpose of clustering is to discover the underlying structure of a data set by organizing its samples into several clusters such that the objects within a cluster are highly similar but remarkably dissimilar to objects in other clusters [5]. Many researchers have studied the clustering problem in the past decades, and various kinds of clustering algorithms have been developed in the literature, including partitional, hierarchical, density-based and grid-based clustering.
The samples in the stable set are assigned to the core region of each cluster by a traditional clustering algorithm. The samples in the unstable set are assigned to the fringe region of the corresponding cluster according to the distances between the elements and the centers of the cluster core regions. Therefore, a three-way clustering is naturally formed.
We use the k-means algorithm to obtain the core region of each cluster. The samples in the unstable set are assigned to the fringe region of each cluster by the local co-association coefficient corresponding to the discovered core regions. The whole process is shown in Algorithm 4.
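The core and fringe assignment described in these highlights could be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's Algorithm 4: the function name `three_way_assign` and the parameter `fringe_ratio` are hypothetical, the stable/unstable split is taken as given, and the fringe rule uses plain Euclidean distance to the core centers rather than the local co-association coefficient.

```python
import numpy as np
from sklearn.cluster import KMeans


def three_way_assign(X, stable_mask, n_clusters, fringe_ratio=1.5, seed=0):
    """Return (core, fringe): one index array per cluster in each list.

    Assumptions (not taken from the paper):
    - stable_mask is a boolean array marking the stable set;
    - an unstable sample joins the fringe of every cluster whose core
      center lies within fringe_ratio times its nearest-center distance.
    """
    X = np.asarray(X, dtype=float)
    stable_mask = np.asarray(stable_mask, dtype=bool)
    stable_idx = np.flatnonzero(stable_mask)
    unstable_idx = np.flatnonzero(~stable_mask)

    # Core regions: k-means restricted to the stable samples.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = km.fit_predict(X[stable_idx])
    centers = km.cluster_centers_
    core = [stable_idx[labels == k] for k in range(n_clusters)]

    # Fringe regions: unstable samples that are close to a core center.
    fringe = [np.array([], dtype=int) for _ in range(n_clusters)]
    if unstable_idx.size:
        diff = X[unstable_idx, None, :] - centers[None, :, :]
        dist = np.linalg.norm(diff, axis=2)
        nearest = dist.min(axis=1, keepdims=True)
        member = dist <= fringe_ratio * nearest
        fringe = [unstable_idx[member[:, k]] for k in range(n_clusters)]
    return core, fringe


if __name__ == "__main__":
    # Two compact groups plus one sample lying between them.
    X = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10], [5.4, 5.4]]
    mask = [True] * 6 + [False]
    core, fringe = three_way_assign(X, mask, n_clusters=2)
    print("core:", [c.tolist() for c in core])
    print("fringe:", [f.tolist() for f in fringe])
```

Because a sample may fall in the fringe of more than one cluster, the result is a pair of regions per cluster rather than a single hard partition, which is the point of the three-way representation.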

Summary

Introduction

Data clustering is one of the most fundamental topics for data exploration in machine learning and plays an important role in many fields such as information granulation, image analysis, network structure analysis and others [1–4]. The purpose of clustering is to discover the underlying structure of a data set by organizing its samples into several clusters such that the objects within a cluster are highly similar but remarkably dissimilar to objects in other clusters [5]. Many researchers have studied the clustering problem in the past decades, and various kinds of clustering algorithms have been developed in the literature, including partitional, hierarchical, density-based and grid-based clustering. We develop a new three-way clustering algorithm based on similarity-based sample’s stability. We use the similarity of two samples to define the co-association frequency and compute the sample’s stability. The samples in the unstable set are assigned to the fringe region of the corresponding cluster according to the distances between the elements and the centers of the cluster core regions. Based on the similarity of two samples, a new definition of co-association frequencies is proposed and its relation to the sample’s stability is discussed. The similarity-based sample’s stability measured by the proposed method is verified by experiments on UCI data sets.

Figure 2: A demonstration of a three-way cluster.
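To make the idea of a similarity-based stability concrete, here is a minimal sketch. The exact definitions are not given in this summary, so everything specific is an assumption for illustration only: a Gaussian similarity with bandwidth `sigma` stands in for the co-association frequency, the determinacy function is a piecewise-linear map with threshold `t` that is 0 at `t` and 1 at the extremes, and the names `determinacy`, `sample_stability` and `split_universe` are hypothetical.

```python
import numpy as np


def determinacy(p, t=0.5):
    """Assumed determinacy function: 0 at p == t, 1 at p == 0 or p == 1."""
    p = np.asarray(p, dtype=float)
    return np.where(p < t, (t - p) / t, (p - t) / (1.0 - t))


def sample_stability(X, t=0.5, sigma=None):
    """Average determinacy of each sample's similarity to all others.

    Assumption: the frequency of two samples is their Gaussian
    similarity exp(-d^2 / (2 * sigma^2)), with sigma defaulting to the
    median pairwise distance.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    if sigma is None:
        sigma = np.median(dist[~np.eye(n, dtype=bool)])
        if sigma == 0:
            sigma = 1.0  # all samples coincide; any bandwidth works
    freq = np.exp(-dist ** 2 / (2.0 * sigma ** 2))
    det = determinacy(freq, t)
    np.fill_diagonal(det, 0.0)  # ignore each sample's pairing with itself
    return det.sum(axis=1) / (n - 1)


def split_universe(stability, threshold):
    """Boolean mask: True for the stable set, False for the unstable set."""
    return np.asarray(stability) >= threshold


if __name__ == "__main__":
    # Two compact groups plus one sample lying between them.
    X = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10], [5.4, 5.4]]
    s = sample_stability(X)
    print("stability:", np.round(s, 3))
    print("stable set:", split_universe(s, threshold=0.5))
```

Under these assumptions, a sample whose similarities to the others are all clearly high or clearly low gets a stability near 1, while a sample with many ambiguous similarities gets a low value and ends up in the unstable set.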

Similarity-Based Sample’s Stability

Step 1

Step 2

Step 3

Experimental Results

S1 Zoo Iris Wine Dermatology Segmentation

Concluding Remarks


Journal: Mathematical Problems in Engineering	Publication Date: Feb 27, 2022
Citations: 3	License type: CC BY 4.0
