Abstract

Online streaming feature selection (OSFS), an online learning approach for handling streaming features, is critical in addressing high‐dimensional data. In real big data‐related applications, the patterns and distributions of streaming features change constantly over time because the data generation environment is dynamic. However, existing OSFS methods rely on preset, fixed hyperparameters, which leads to poor selection performance when the features are dynamic. To address this shortcoming, the authors propose a novel OSFS algorithm based on vague sets, named OSFS‐Vague. Its main idea is to combine uncertainty and three‐way decision theories so that feature selection moves from the traditional dichotomous method to a trichotomous one. OSFS‐Vague also improves the method for calculating the correlation between features and labels. Moreover, OSFS‐Vague uses the distance correlation coefficient to classify streaming features into relevant features, weakly redundant features, and redundant features. Finally, the relevant features and weakly redundant features are filtered to obtain an optimal feature set. To evaluate the proposed OSFS‐Vague, extensive empirical experiments have been conducted on 11 datasets. The results demonstrate that OSFS‐Vague outperforms six state‐of‐the‐art OSFS algorithms in terms of selection accuracy and computational efficiency.
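To make the trichotomous idea concrete, the following is a minimal sketch, not the OSFS‐Vague algorithm itself. It computes the standard sample distance correlation between one feature and the labels, then maps the score to one of the three classes named above. The function names and the thresholds `low` and `high` are hypothetical, chosen only for illustration; the abstract does not say how OSFS‐Vague sets its decision boundaries, and the fixed values here stand in for whatever the method derives.

```python
import numpy as np


def _centered_distances(v):
    """Pairwise absolute distances of a 1-D sample, double-centered."""
    v = np.asarray(v, dtype=float).reshape(-1, 1)
    d = np.abs(v - v.T)
    return (d
            - d.mean(axis=0, keepdims=True)
            - d.mean(axis=1, keepdims=True)
            + d.mean())


def distance_correlation(x, y):
    """Sample distance correlation between two 1-D samples, in [0, 1]."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = max((a * b).mean(), 0.0)  # guard against tiny negative round-off
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    if denom == 0.0:
        # One of the samples is constant, so there is no dependence to measure.
        return 0.0
    return float(np.sqrt(dcov2 / denom))


def three_way_label(score, low=0.2, high=0.6):
    """Map a score to one of three classes.

    `low` and `high` are hypothetical thresholds for this sketch only.
    """
    if score >= high:
        return "relevant"
    if score <= low:
        return "redundant"
    return "weakly redundant"
```

In a streaming setting, one might call `three_way_label(distance_correlation(feature, labels))` as each feature arrives and keep only features labelled relevant or weakly redundant for a later filtering step, again assuming this simplified scoring.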

