Online group streaming feature selection is of significant research value for large-scale streaming data processing, and related work based on rough set theory has attracted academic interest. Nevertheless, most existing algorithms depend on parameters, and grid searching for their optimal values reduces efficiency. Moreover, existing online group streaming feature selection frameworks do not fit practical situations well. To address these issues, this paper investigates an online group streaming feature selection method based on fuzzy neighborhood granular ball rough sets. First, Canopy clustering is introduced into granular ball computing, and an adaptive neighborhood for each sample is generated from the granular ball distribution. Second, we construct a fuzzy neighborhood granular ball rough set (FNGBRS) model and propose an integrated dependence degree to achieve maximum dependency and minimum classification error. Then, the purity of granular balls is used as the feature weight, and several uncertainty measures based on FNGBRS are presented. Finally, we define a random factor to control the size of streaming groups and design an online group streaming feature selection algorithm. Comparative experiments on sixteen public datasets demonstrate that the proposed algorithm achieves superior and stable classification performance, together with improved efficiency owing to its parameter-free design.
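
The abstract does not spell out how the random factor is defined, so the following is only a minimal sketch of the general idea of feature groups arriving with randomly varying sizes. The function name `stream_feature_groups`, the parameter `max_group_size`, and the uniform draw are all assumptions made for illustration; they are not taken from the paper.

```python
import random
from typing import Iterator, List, Optional


def stream_feature_groups(
    n_features: int,
    max_group_size: int,
    seed: Optional[int] = None,
) -> Iterator[List[int]]:
    """Yield consecutive groups of feature indices with randomly drawn sizes.

    Hypothetical stand-in for a "random factor": each group size is drawn
    uniformly from 1..max_group_size (an assumption, not the paper's rule).
    """
    if n_features < 0:
        raise ValueError("n_features must be non-negative")
    if max_group_size < 1:
        raise ValueError("max_group_size must be at least 1")

    rng = random.Random(seed)  # local generator, so runs are reproducible
    start = 0
    while start < n_features:
        size = rng.randint(1, max_group_size)  # inclusive on both ends
        end = min(start + size, n_features)    # last group may be shorter
        yield list(range(start, end))
        start = end


if __name__ == "__main__":
    # Simulate 20 features arriving in groups of at most 5 features each.
    for t, group in enumerate(stream_feature_groups(20, 5, seed=0)):
        print(f"time step {t}: features {group}")
```

In an online setting, each yielded group would be passed to the selection step as it arrives, so that group sizes are not fixed in advance.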