Abstract

Feature selection is a necessary and significant pre-processing step in many fields, especially machine learning. However, in some real-world problems features arrive one at a time, and many existing approaches do not work well on such online streaming features. In addition, most online streaming feature selection (OSFS) methods require domain knowledge in order to set optimal parameters in advance. This paper therefore proposes an effective feature selection method for online streaming features, named OFS-Gapknn. First, a new neighborhood rough set relation is defined that combines the advantages of the k-nearest neighborhood and the Gap neighborhood. The proposed neighborhood relation works well on unevenly distributed sample spaces and requires neither parameters nor domain knowledge. Then, the relevance and redundancy of features are analyzed using the dependency based on the neighborhood rough set. Finally, an optimal feature subset is obtained. To validate the effectiveness of the proposed algorithm, it is compared with four traditional methods and three OSFS methods on 11 datasets. Experimental results indicate the superiority and significance of the proposed method.
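
To make the idea of a neighborhood-based dependency concrete, the following is a minimal sketch rather than the OFS-Gapknn algorithm itself. It assumes a plain k-nearest neighborhood with Euclidean distance and a fixed `k`, whereas the proposed relation also uses the Gap neighborhood and needs no parameter. The function names `knn_neighborhood_dependency` and `stream_select`, and the greedy "keep a feature only if dependency increases" rule, are hypothetical illustrations and do not reproduce the paper's relevance and redundancy analysis.

```python
import numpy as np

def knn_neighborhood_dependency(X, y, k=3):
    """Fraction of samples whose k nearest neighbors all share their label.

    Assumption: a plain k-nearest neighborhood (Euclidean distance) stands in
    for the paper's combined k-nearest/Gap neighborhood relation.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = X.shape[0]
    # Pairwise Euclidean distances between all samples.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbor
    k = min(k, n - 1)
    neighbors = np.argsort(dist, axis=1, kind="stable")[:, :k]
    # A sample lies in the positive region if its neighborhood is label-pure.
    consistent = (y[neighbors] == y[:, None]).all(axis=1)
    return float(consistent.mean())

def stream_select(X, y, k=3):
    """Hypothetical greedy rule: keep an arriving feature only if it raises dependency."""
    X = np.asarray(X, dtype=float)
    selected, best = [], 0.0
    for j in range(X.shape[1]):  # features arrive one by one
        candidate = selected + [j]
        dep = knn_neighborhood_dependency(X[:, candidate], y, k)
        if dep > best:
            selected, best = candidate, dep
    return selected, best

# Toy data: feature 0 separates the classes, feature 1 does not.
X_demo = np.array([[0.0, 5.0], [0.1, 1.0], [0.2, 9.0],
                   [1.0, 5.1], [1.1, 1.1], [1.2, 9.1]])
y_demo = np.array([0, 0, 0, 1, 1, 1])
selected_demo, dependency_demo = stream_select(X_demo, y_demo, k=2)
```

On this toy data the informative first feature is kept and the second is discarded, because it cannot raise the dependency any further.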
