Unsupervised Feature Selection and Clustering Optimization Based on Improved Differential Evolution

Tao Li, Hongbin Dong

doi:10.1109/access.2019.2937739

Abstract

Feature selection based on supervised learning has been widely studied and applied in machine learning and data mining. However, unsupervised feature selection remains a difficult research area because label information is unavailable, especially for clustering tasks. Irrelevant and redundant features in the original data seriously hinder the discovery of the clustering structure and weaken the performance of subsequent classification. To address this problem, this paper proposes an unsupervised feature selection and clustering algorithm based on the evolutionary computing framework. First, a binary differential evolution algorithm is constructed for unsupervised feature selection. Specifically, the individuals of the population characterize the feature subspaces, and an improved Laplacian model is designed to measure the local manifold structure of each individual. This yields an approximately optimal manifold structure and the corresponding feature subset. Then, a continuous differential evolution algorithm is executed on the optimized feature subset, with an individual representation strategy and an integrated individual measure function designed for clustering. Moreover, the predicted pseudo-labels are used for classification to further verify the validity of the clustering. The experimental results demonstrate that the proposed framework outperforms most state-of-the-art methods.

Highlights

High dimensionality has become increasingly prominent in real-world applications
The work of the paper focuses on two parts: unsupervised feature selection based on discrete differential evolution (UFDDE) and a clustering algorithm based on continuous differential evolution (CCDE)
To verify the ability of the feature subspace selected by UFDDE to characterize the original data structure, the paper proposes an adaptive clustering algorithm based on continuous differential evolution (CCDE)

Summary

INTRODUCTION

High dimensionality has become increasingly prominent in real-world applications. [...] is widely used in feature selection because of its good global search ability [6]. In unsupervised feature selection, most current algorithms use a transformation to map the original high-dimensional space to a new low-dimensional space. This achieves dimensionality reduction, but the resulting features lose the physical meaning of the original data set, which reduces the interpretability of the learning model. Some unsupervised feature selection algorithms analyze clustering performance according to the size of the feature subset, which makes the learning model less adaptive. In response to these issues, a framework for unsupervised feature selection and clustering based on improved differential evolution (UFSCDE) is proposed.

The basic steps of the K-means algorithm are as follows:

1. Randomly select K samples from the original data set as the initial cluster centers.
2. Calculate the distance between each remaining sample and the K cluster centers, and assign each sample to the nearest cluster center.
3. Recalculate the centers of the K clusters.
4. Repeat steps 2 and 3 until the cluster centers no longer change, or until a set number of iterations or the error tolerance is reached.
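The four K-means steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the authors' implementation; the function name `kmeans` and the parameters `max_iter`, `tol`, and `seed` are assumptions introduced here, and the sketch assigns every sample (including the initial centers themselves) in step 2.

```python
import math
import random


def kmeans(samples, k, max_iter=100, tol=1e-6, seed=0):
    """Plain K-means following the four steps in the text.

    samples: list of equal-length numeric tuples; k: number of clusters.
    max_iter, tol and seed are illustrative parameters (assumptions).
    """
    rng = random.Random(seed)
    # Step 1: pick K distinct samples at random as the initial centers.
    centers = [tuple(s) for s in rng.sample(samples, k)]
    labels = [0] * len(samples)
    for _ in range(max_iter):
        # Step 2: assign every sample to its nearest center.
        labels = [
            min(range(k), key=lambda j: math.dist(s, centers[j]))
            for s in samples
        ]
        # Step 3: recompute each center as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == j]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centers.append(mean)
            else:
                # An empty cluster keeps its previous center.
                new_centers.append(centers[j])
        # Step 4: stop once no center moves by more than the tolerance.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers, labels


# Two well-separated groups of three points each.
data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centers, labels = kmeans(data, k=2)
```

The iteration cap and the tolerance correspond to the two stopping conditions named in step 4. Which cluster receives label 0 depends on the random initial centers, so only the grouping is meaningful, not the label values.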

PROPOSED UNSUPERVISED FEATURE SELECTION ALGORITHM

CROSSOVER OPERATOR BASED ON FITNESS VALUE

INDIVIDUAL SELECTION OPERATION

8: Compute similarity matrix by Z
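This summary does not define Z or the similarity measure used in this step. As a hedged illustration only, the sketch below assumes that Z is the data matrix restricted to the features switched on by an individual's binary mask, and that similarity is a Gaussian (RBF) kernel with a hypothetical bandwidth `sigma`. The helper names `select_features` and `gaussian_similarity` are introduced here and are not taken from the paper.

```python
import math


def select_features(X, mask):
    """Keep only the columns of X whose mask bit is 1 (hypothetical helper)."""
    return [tuple(v for v, bit in zip(row, mask) if bit) for row in X]


def gaussian_similarity(Z, sigma=1.0):
    """Return the n x n matrix S with S[i][j] = exp(-||z_i - z_j||^2 / (2 sigma^2)).

    The Gaussian kernel is an assumption; the paper may use a different measure.
    """
    n = len(Z)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            d2 = sum((a - b) ** 2 for a, b in zip(Z[i], Z[j]))
            # The matrix is symmetric, so fill both triangles at once.
            S[i][j] = S[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    return S


# Three samples with three features; the mask drops the middle feature.
X = [(0.0, 9.0, 0.0), (0.0, -3.0, 0.0), (3.0, 1.0, 4.0)]
Z = select_features(X, mask=(1, 0, 1))
S = gaussian_similarity(Z, sigma=5.0)
```

In this toy input the first two samples differ only in the dropped feature, so they become identical in Z and their similarity is 1.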

PROPOSED EVOLUTIONARY CLUSTERING ALGORITHM

Findings

CONCLUSION

Journal: IEEE access : practical innovations, open solutions
Publication Date: Jan 1, 2019
Citations: 6
License type: CC BY 4.0
