Abstract

Most unsupervised feature selection methods employ pseudo labels generated by clustering to guide feature selection; however, noisy and irrelevant features degrade the cluster structure, so the resulting pseudo labels supervise feature selection poorly. In light of this, we propose the Consensus Guided Unsupervised Feature Selection (CGUFS) framework, which introduces consensus clustering to generate pseudo labels for feature selection. Specifically, multiple diverse basic partitions are generated from the data, and consensus clustering fuses them into a high-quality, robust partition that guides feature selection within a one-step framework. In addition, because consensus clustering yields crisp cluster indicators, complex constraints such as non-negativity can be removed. Based on the CGUFS framework, we put forward two formulations, one using a utility function and the other a co-association matrix, and we propose a (weighted) K-means-like optimization procedure that solves them efficiently, with theoretical support. Moreover, we extend the CGUFS framework to feature selection on multi-view data. Extensive experiments on several single-view and multi-view data sets from different domains demonstrate that our methods outperform the most recent state-of-the-art work in terms of both effectiveness and efficiency. Important impact factors and model parameters within CGUFS are thoroughly discussed for practical use.
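To make the co-association matrix mentioned above concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard definition, in which entry (i, j) is the fraction of basic partitions that place samples i and j in the same cluster. The function name `co_association` and the three toy partitions are hypothetical, chosen only for illustration; how CGUFS uses this matrix inside its objective is not shown here.

```python
from typing import List, Sequence


def co_association(partitions: Sequence[Sequence[int]]) -> List[List[float]]:
    """Return the co-association matrix of a set of basic partitions.

    Entry (i, j) is the fraction of partitions in which samples i and j
    receive the same cluster label (standard definition, assumed here).
    """
    if not partitions:
        raise ValueError("need at least one basic partition")
    n = len(partitions[0])
    if any(len(labels) != n for labels in partitions):
        raise ValueError("all partitions must label the same samples")

    r = len(partitions)
    # Count co-occurrences as integers first, then normalize by r.
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / r for c in row] for row in counts]


# Three hypothetical basic partitions of five samples.
# Label ids are arbitrary and need not match across partitions.
basic_partitions = [
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
S = co_association(basic_partitions)
```

In this toy input, samples 0 and 1 share a cluster in every partition, so `S[0][1]` is 1.0, while samples 2 and 3 agree in two of the three partitions. Pairs with high co-association are the ones a consensus partition would be expected to keep together.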
