Abstract

In real-world applications, the entire feature space may not be known in advance; features can instead arrive as a stream, in which case they are called streaming features. Online streaming feature selection aims to select the optimal streaming features on the fly and can be summarized in three main components: discarding irrelevant features, selecting relevant features, and removing redundant features. The core issue of a streaming feature selection framework is therefore how to calculate the relationships between features. This paper applies Rough Set models to discover these relationships because of two crucial advantages: they require no domain knowledge, and they can evaluate the selected features as a whole. After formally defining feature relevance, irrelevance, and redundancy from the Rough Set perspective, we analyze and abstract the feature relationship calculation at three levels: the Rough Set model, the positive region, and the consistency calculation. We then design a novel, general Rough Set based Streaming Feature Selection Framework, named RS-SFSF, from which new algorithms for different problems can be assembled step by step. Researchers in different areas can quickly build the algorithms they need on top of this framework. To demonstrate the effectiveness of RS-SFSF, we derive four new algorithms from it using the classical Rough Set model, the neighborhood Rough Set model, and the fuzzy Rough Set model. Extensive experiments on twelve real-world datasets indicate the efficiency of the framework.
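
To make the three components concrete, the following is a minimal sketch under stated assumptions, not the RS-SFSF algorithm itself. It assumes the classical Rough Set model with discrete feature values, measures consistency as the fraction of samples in the positive region, and uses simple acceptance and pruning rules chosen here for illustration. The names `dependency` and `stream_select` and the toy data are hypothetical.

```python
from collections import defaultdict


def dependency(rows, labels, features):
    """Fraction of samples in the positive region of `features`.

    A sample is in the positive region when every sample that shares its
    values on `features` also shares its label (a consistent block).
    """
    blocks = defaultdict(list)
    for row, label in zip(rows, labels):
        # With no features, every sample gets the same empty key.
        key = tuple(row[f] for f in features)
        blocks[key].append(label)
    consistent = sum(len(b) for b in blocks.values() if len(set(b)) == 1)
    return consistent / len(rows)


def stream_select(rows, labels, feature_stream):
    """Process features one at a time (illustrative rules, an assumption)."""
    selected = []
    for f in feature_stream:
        current = dependency(rows, labels, selected)
        # Discard: the new feature does not enlarge the positive region.
        if dependency(rows, labels, selected + [f]) <= current:
            continue
        # Select: the new feature makes more samples consistent.
        selected.append(f)
        full = dependency(rows, labels, selected)
        # Remove redundancy: drop older features that no longer matter.
        for g in list(selected[:-1]):
            rest = [h for h in selected if h != g]
            if dependency(rows, labels, rest) >= full:
                selected = rest
    return selected


# Toy data: columns are features 0..3, one row per sample.
rows = [
    (0, 0, 0, 0),
    (0, 0, 1, 0),
    (1, 0, 0, 0),
    (1, 0, 1, 1),
    (2, 0, 0, 1),
    (2, 0, 1, 1),
]
labels = [0, 0, 0, 1, 1, 1]

print(stream_select(rows, labels, [0, 1, 2, 3]))
print(stream_select(rows, labels, [0, 1, 3, 2]))
```

In this toy data, feature 1 is constant and is always discarded, and the arrival order changes the outcome: feature 3 alone determines the label, so when it arrives early it displaces feature 0. Because each candidate is judged jointly with the already selected set, this rule can also reject a feature that would only become useful in combination with a later one; handling such cases is a design choice this sketch does not address.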
