Abstract

Data is growing at an exponential pace. To cope with this data explosion, effective data processing and analysis techniques are needed. Feature selection is the task of selecting a subset of features from a dataset that still provides most of the useful information. Various tools are available as the underlying framework for this process; however, Rough Set Theory is the most prominent due to its analysis-friendly nature. The majority of Rough Set based feature selection algorithms use the positive-region-based dependency measure as the sole criterion for selecting a feature subset. Calculating the positive region requires computing the lower approximation, which in turn involves the indiscernibility relation. In this paper, new definitions of two Rough Set preliminaries, i.e. the lower and upper approximations, are proposed. The new definitions are computationally less expensive than the conventional ones. On five publicly available datasets, the proposed redefinitions showed a 42.78% decrease in execution time for the redefined lower approximation and a 43.06% decrease for the redefined upper approximation, while maintaining 100% accuracy. Finally, based on these redefined approximations, we propose a feature selection algorithm which, when compared with state-of-the-art techniques, shows a significant increase in performance without affecting accuracy.
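For context, the conventional notions the abstract builds on can be sketched as follows. The snippet below is a minimal illustrative implementation of the standard indiscernibility relation, lower and upper approximations, and the positive-region dependency measure; it does not show the redefined approximations proposed in the paper, which the abstract does not specify. The function names and the toy decision table are hypothetical.

```python
from collections import defaultdict

def indiscernibility(universe, attributes):
    """Partition the universe into equivalence classes of objects that are
    indiscernible on the given condition attributes (the IND relation)."""
    classes = defaultdict(set)
    for obj, values in universe.items():
        key = tuple(values[a] for a in attributes)
        classes[key].add(obj)
    return list(classes.values())

def lower_approximation(partition, target):
    """Union of equivalence classes fully contained in the target set."""
    approx = set()
    for cls in partition:
        if cls <= target:
            approx |= cls
    return approx

def upper_approximation(partition, target):
    """Union of equivalence classes that intersect the target set."""
    approx = set()
    for cls in partition:
        if cls & target:
            approx |= cls
    return approx

def dependency(universe, condition_attrs, decision_attr):
    """Positive-region dependency gamma_B(D) = |POS_B(D)| / |U|, where
    POS_B(D) is the union of the lower approximations of the decision classes."""
    partition = indiscernibility(universe, condition_attrs)
    decision_classes = defaultdict(set)
    for obj, values in universe.items():
        decision_classes[values[decision_attr]].add(obj)
    pos = set()
    for d_class in decision_classes.values():
        pos |= lower_approximation(partition, d_class)
    return len(pos) / len(universe)

# Toy decision table: objects with condition attributes a, b and decision d.
U = {
    "x1": {"a": 0, "b": 1, "d": "yes"},
    "x2": {"a": 0, "b": 1, "d": "no"},
    "x3": {"a": 1, "b": 0, "d": "yes"},
    "x4": {"a": 1, "b": 1, "d": "no"},
}
print(dependency(U, ["a", "b"], "d"))  # 0.5: x1 and x2 are indiscernible but differ on d
```

A dependency of 1.0 means every object is classified consistently by the chosen attributes; feature selection algorithms of the kind the abstract describes typically search for a minimal attribute subset that preserves this dependency value.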
