Abstract

Because labels are often scarce, unsupervised feature selection (UFS) methods are widely used in the analysis of high-dimensional data. However, most existing UFS methods focus on how well features preserve the data structure and ignore the redundancy among features. Determining an appropriate number of features is a further challenge. In this paper, an efficient unsupervised feature selection method through feature clustering (EUFSFC) is proposed to address the redundancy among features and to determine the size of the final feature subset. The proposed method consists of two steps: (a) feature cluster analysis and (b) selection of representative features. An extended density-based clustering algorithm is proposed to separate features into an appropriate number of disjoint clusters without requiring a predefined number of clusters or predefined radii. Features are then selected by choosing the most representative features from those clusters. Experiments are conducted to show the effectiveness of the proposed feature selection method.
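The two-step idea (cluster the features, then keep one representative per cluster) can be illustrated with a minimal sketch. This is not the EUFSFC algorithm: the abstract does not specify the extended density-based clustering, so the sketch substitutes a simple stand-in that groups features whose absolute Pearson correlation reaches a threshold. Unlike the proposed method, this stand-in does require a predefined parameter. The function names, the `threshold` value, and the toy data are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def cluster_features(columns, threshold=0.9):
    """Step (a), stand-in: group features linked by |correlation| >= threshold.

    Assumption: connected components of a thresholded similarity graph,
    NOT the density-based clustering described in the abstract.
    """
    d = len(columns)
    sim = [[abs(pearson(columns[i], columns[j])) for j in range(d)] for i in range(d)]
    for i in range(d):
        sim[i][i] = 1.0  # a feature is always fully similar to itself
    labels = [-1] * d
    clusters = []
    for start in range(d):
        if labels[start] != -1:
            continue
        cid = len(clusters)
        labels[start] = cid
        members, stack = [start], [start]
        while stack:  # depth-first expansion of the current cluster
            i = stack.pop()
            for j in range(d):
                if labels[j] == -1 and sim[i][j] >= threshold:
                    labels[j] = cid
                    members.append(j)
                    stack.append(j)
        clusters.append(sorted(members))
    return clusters, sim


def select_representatives(clusters, sim):
    """Step (b): per cluster, keep the feature most similar to its cluster mates."""
    selected = []
    for members in clusters:
        best = max(members, key=lambda i: sum(sim[i][j] for j in members))
        selected.append(best)
    return sorted(selected)


# Toy data (hypothetical): each inner list is one feature measured on 5 samples.
columns = [
    [1, 2, 3, 4, 5],    # feature 0
    [2, 4, 6, 8, 10],   # feature 1: scaled copy of feature 0
    [5, 4, 3, 2, 1],    # feature 2: reversed copy of feature 0
    [1, -1, 1, -1, 1],  # feature 3: uncorrelated with features 0-2
    [2, -2, 2, -2, 2],  # feature 4: scaled copy of feature 3
]
clusters, sim = cluster_features(columns, threshold=0.9)
selected = select_representatives(clusters, sim)
print(clusters)  # two groups of mutually redundant features
print(selected)  # one feature index per group
```

In this sketch the size of the selected subset equals the number of clusters found, which mirrors how the abstract ties the final subset size to the clustering result rather than to a user-supplied feature count.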
