Abstract
Multi-label feature selection has attracted considerable attention in many big data applications. However, traditional multi-label feature selection methods generally ignore a common real-world scenario in which features flow into the model one by one over time. To address this problem, we develop a novel online multi-label streaming feature selection method based on the neighborhood rough set, which selects a feature subset containing strongly relevant and non-redundant features. The main motivation is that data mining based on the neighborhood rough set does not require any prior knowledge of the feature space structure. Moreover, the neighborhood rough set handles mixed data without breaking the neighborhood and order structure of the data. In this paper, we first introduce the maximum-nearest-neighbor of an instance to granulate all instances, which solves the problem of granularity selection in the neighborhood rough set, and then generalize the single-label neighborhood rough set to fit multi-label learning. We then present an online multi-label streaming feature selection framework that comprises online importance selection and online redundancy update. Under this framework, we propose a criterion for selecting features that are important relative to the currently selected features, and design a bound on pairwise correlations between features under the label set to filter out redundant features. An empirical study on a series of benchmark datasets demonstrates that the proposed method outperforms other state-of-the-art multi-label feature selection methods.
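The two-stage framework described above (online importance selection followed by online redundancy update) can be illustrated with a minimal sketch. It is not the method of the paper: the abstract does not define the maximum-nearest-neighbor granulation, the importance criterion, or the pairwise correlation bound, so the sketch substitutes simple stand-ins. It assumes a fixed k-nearest-neighbor neighborhood, a dependency score equal to the fraction of (instance, label) pairs whose neighbors all agree with the instance, and a redundancy test that drops a feature when removing it does not lower that score. The names `stack`, `dependency`, `stream_select`, and the parameter `k` are hypothetical.

```python
import numpy as np

def stack(cols, n):
    """Stack the selected feature columns into an (n, m) matrix; m may be 0."""
    return np.column_stack(list(cols.values())) if cols else np.empty((n, 0))

def dependency(X, Y, k=3):
    """Assumed multi-label dependency: the share of (instance, label) pairs
    for which all k nearest neighbors carry the same label value."""
    if X.shape[1] == 0:
        return 0.0
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)                 # an instance is not its own neighbor
    nb = np.argsort(dist, axis=1)[:, :k]           # (n, k) neighbor indices
    agree = (Y[nb] == Y[:, None, :]).all(axis=1)   # (n, n_labels)
    return float(agree.mean())

def stream_select(feature_stream, Y, k=3):
    """Process features one at a time; return the names of the kept features."""
    n = Y.shape[0]
    selected = {}                                  # name -> column, in arrival order
    for name, col in feature_stream:
        # Online importance selection: keep the new feature only if it
        # raises the dependency of the currently selected subset.
        current = dependency(stack(selected, n), Y, k)
        candidate = {**selected, name: col}
        if dependency(stack(candidate, n), Y, k) <= current:
            continue
        selected = candidate
        # Online redundancy update: drop any older feature whose removal
        # does not lower the dependency (stand-in for the paper's bound).
        for old in list(selected):
            if old == name or len(selected) == 1:
                continue
            rest = {f: c for f, c in selected.items() if f != old}
            if dependency(stack(rest, n), Y, k) >= dependency(stack(selected, n), Y, k):
                selected = rest
    return list(selected)

# Toy data: 40 instances, 2 labels, 4 label combinations in blocks of 10.
i = np.arange(40)
Y = np.column_stack([i // 20, (i // 10) % 2])
signal = (i // 10) * 10 + (i % 10) * 0.01          # separates the four blocks
noise = ((i * 7) % 40) / 40.0                      # unrelated to the labels
stream = [("noise", noise), ("signal", signal), ("copy", signal.copy())]
result = stream_select(stream, Y)
print(result)
```

On this toy stream the informative feature is kept, the uninformative one is removed once the informative one arrives, and the duplicate is rejected on arrival because it adds no dependency.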