Feature Selection: Filter Methods Performance Challenges

Marianne Cherrington, Fadi Thabtah, Qiang Xu, Joan Lu

doi:10.1109/iccisci.2019.8716478

Abstract

Learning is the heart of intelligence. The focus in machine learning is to automate methods that achieve objectives, improve predictions or encourage informed behavior. Feature selection is a vital step in data analysis that often reduces dataset dimensionality by eliminating irrelevant and/or redundant attributes to simplify the learning process or improve the quality of outcomes. This research critically analyses different filter methods based on ranking procedures (Information Gain (IG), Chi-square (CHI), V-score, Fisher Score, mRMR, Va and ReliefF) and identifies possible challenges that arise. We concentrate in particular on how threshold determination can affect the results of different filter methods based on ranked scores. We show that this issue is vital, especially in the era of big data, in which users deal with tens of thousands of attributes but only a limited number of instances.
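
To illustrate why threshold determination matters for ranked filter methods, the following is a minimal sketch, not the authors' implementation. It scores features with a Fisher score (one of the methods named above), ranks them, and then applies two common threshold styles: keep the top k features, or keep every feature whose score reaches a cutoff. The toy data, the function names, and the threshold values are all hypothetical and chosen only for illustration.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Fisher score of one feature: between-class scatter over within-class scatter."""
    overall = mean(values)
    between = 0.0
    within = 0.0
    for c in set(labels):
        group = [v for v, y in zip(values, labels) if y == c]
        between += len(group) * (mean(group) - overall) ** 2
        within += len(group) * pvariance(group)
    # A feature with zero within-class variance separates classes perfectly.
    return between / within if within > 0 else float("inf")


def rank_features(columns, labels):
    """Return (name, score) pairs sorted from highest to lowest score."""
    scores = {name: fisher_score(vals, labels) for name, vals in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def select_top_k(ranked, k):
    """Threshold style 1: keep a fixed number of top-ranked features."""
    return [name for name, _ in ranked[:k]]


def select_by_cutoff(ranked, cutoff):
    """Threshold style 2: keep every feature whose score is at least `cutoff`."""
    return [name for name, score in ranked if score >= cutoff]


# Hypothetical toy dataset: six instances, two classes, three features.
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "f_strong": [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],  # separates the classes well
    "f_weak": [1.0, 2.0, 3.0, 1.5, 2.5, 3.5],    # slight class shift
    "f_noise": [2.0, 1.0, 3.0, 2.0, 3.0, 1.0],   # identical class means
}

ranked = rank_features(columns, labels)
top_two = select_top_k(ranked, k=2)
above_one = select_by_cutoff(ranked, cutoff=1.0)
```

On this toy data the ranking is the same in both cases, yet the two thresholds retain different subsets: the top-2 rule keeps a weakly informative feature that the score cutoff discards. With tens of thousands of attributes and few instances, such choices can change the selected subset substantially.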
