Abstract

Feature selection techniques can be evaluated based on either model performance or the stability (robustness) of the technique. The ideal situation is to choose a feature selection technique that is robust to change, while also ensuring that models built with the selected features perform well. One domain where feature selection is especially important is software defect prediction, where large numbers of metrics collected from previous software projects are used to help engineers focus their efforts on the most faulty modules. This study presents a comprehensive empirical examination of seven filter-based feature ranking techniques (rankers) applied to nine real-world software measurement datasets of different sizes. Experimental results demonstrate that signal-to-noise ranker performed moderately in terms of robustness and was the best ranker in terms of model performance. The study also shows that although Relief was the most stable feature selection technique, it performed significantly worse than other rankers in terms of model performance.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.