Rough set theory is a powerful mathematical framework for managing uncertainty and is widely utilized in feature selection. However, traditional rough set-based feature selection algorithms encounter significant challenges, especially when processing large-scale incremental data and adapting to the dynamic nature of real-world scenarios, where both data volume and feature sets are continuously changing. To overcome these limitations, this study proposes an innovative algorithm that integrates local neighborhood rough sets with composite entropy to measure uncertainty in information systems more accurately. By incorporating decision distribution, composite entropy enhances the precision of uncertainty quantification, thereby improving the effectiveness of the algorithm in feature selection. To further improve performance in handling large-scale incremental data, matrix operations are employed in place of traditional set-based methods, allowing the algorithm to fully utilize modern hardware capabilities for accelerated processing. Additionally, parallel computing technology is integrated to further enhance computational speed. An incremental version of the algorithm is also introduced to better adapt to dynamic data environments, increasing its flexibility and practicality. Comprehensive experimental evaluations demonstrate that the proposed algorithm significantly surpasses existing methods in both effectiveness and efficiency.
Read full abstract