Abstract

Feature selection methods extract a relevant subset of features from a high-dimensional feature space so that data can be classified and clustered efficiently. Most real-world datasets have a multi-cluster structure with correlated features. This paper proposes a new multi-cluster feature selection method, called Efficient Multi-Cluster Feature Selection (EMCFS), which selects only the features that best preserve the multi-cluster structure of the data. It uses an anchor graph to build an adjacency matrix of much lower dimension than the feature space. The eigenvectors of the graph Laplacian model the underlying geometric structure of the data. Experimental results on the TDT2 and Reuters-21578 text datasets demonstrate the efficiency of the proposed method. A comparison of EMCFS with the original Multi-Cluster Feature Selection (MCFS) shows improved accuracy and reduced execution time, making EMCFS a promising method for real-world high-dimensional datasets.
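
The pipeline outlined in the abstract (anchor graph, Laplacian eigenvectors, features scored by how well they preserve the cluster structure) can be illustrated with a minimal NumPy sketch. The abstract does not specify how anchors are chosen, how edges are weighted, or how features are scored, so everything below is an assumption: anchors are a random sample of the data, each point is linked to its `s` nearest anchors with Gaussian weights, and features are scored by a ridge regression onto the eigenvectors, used here as a simple stand-in for the sparse (L1-regularized) regression usually associated with MCFS. The function names `anchor_graph_embedding` and `score_features` are hypothetical and do not come from the paper.

```python
import numpy as np

def anchor_graph_embedding(X, n_anchors=30, s=3, n_vecs=3, seed=0):
    """Leading eigenvectors of an anchor-graph adjacency (assumed construction)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Assumption: anchors are a random sample of the data points.
    anchors = X[rng.choice(n, size=n_anchors, replace=False)]
    # Squared Euclidean distance from every point to every anchor: shape (n, m).
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    # Link each point to its s nearest anchors with Gaussian weights.
    nearest = np.argsort(d2, axis=1)[:, :s]
    rows = np.arange(n)[:, None]
    sigma2 = d2[rows, nearest].mean() + 1e-12
    Z = np.zeros_like(d2)
    Z[rows, nearest] = np.exp(-d2[rows, nearest] / sigma2)
    Z /= Z.sum(axis=1, keepdims=True)
    # The n-by-n adjacency W = Z diag(Z^T 1)^{-1} Z^T is never formed.
    # Its top eigenvectors (the bottom eigenvectors of the Laplacian I - W)
    # are the left singular vectors of the small n-by-m matrix below.
    col = Z.sum(axis=0)
    col[col == 0] = 1.0  # guard against anchors that no point links to
    U, _, _ = np.linalg.svd(Z / np.sqrt(col), full_matrices=False)
    return U[:, :n_vecs]

def score_features(X, Y, ridge=1.0):
    """Score each feature by its largest absolute regression coefficient."""
    std = X.std(axis=0)
    std[std == 0] = 1.0
    Xs = (X - X.mean(axis=0)) / std
    # Centering removes the constant component, which carries no cluster information.
    Yc = Y - Y.mean(axis=0)
    d = X.shape[1]
    # Ridge regression is a stand-in for a sparse (L1) regression step.
    A = np.linalg.solve(Xs.T @ Xs + ridge * np.eye(d), Xs.T @ Yc)
    return np.abs(A).max(axis=1)

# Synthetic data: 3 clusters separated only in features 0 and 1,
# plus 6 pure-noise features.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
labels = np.repeat(np.arange(3), 50)
X = rng.normal(size=(150, 8))
X[:, :2] += centers[labels]

Y = anchor_graph_embedding(X)
scores = score_features(X, Y)
top2 = sorted(int(i) for i in np.argsort(scores)[-2:])
print("highest-scoring features:", top2)
```

The efficiency argument rests on the singular value decomposition being applied to an n-by-m matrix, where m is the number of anchors, rather than to a full n-by-n adjacency matrix. Whether EMCFS computes its eigenvectors this way is not stated in the abstract.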
