Abstract

Classification is a machine learning technique that assigns each item in a data set to one of a set of predefined classes or groups. It is widely used in the medical field to classify medical data. To produce better classification results, many classification studies apply feature selection as a preprocessing step, using a subset of the features rather than every feature in a given dataset. Feature selection eliminates irrelevant attributes to obtain high-quality features that may enhance the classification process and produce better classification results. This study focuses on feature selection techniques as a way to help classifiers achieve better classification performance using the most significant features. In the experiments, benchmark feature selection methods are compared on three cancer datasets using four well-recognized machine learning algorithms. The paper then analyzes the performance of all classifiers with and without feature selection in terms of ROC and F-measure. The study found that although no single feature selection method suits all datasets, the results still support the view that feature selection helps increase classifier performance with a minimal number of features.
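
The with/without comparison described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and none of it is taken from the study: the data are synthetic (two informative features plus eight noise features), the filter score is a simple standardized difference of class means, the classifier is a nearest-centroid rule, and only the F-measure is computed (ROC analysis is omitted). All function names are hypothetical.

```python
import random
from statistics import mean, pstdev


def make_data(n, n_noise, seed):
    """Synthetic binary data: features 0 and 1 carry signal, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # balanced classes
        informative = [rng.gauss(3.0 * label, 1.0) for _ in range(2)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(informative + noise)
        y.append(label)
    return X, y


def score_features(X, y):
    """Filter score per feature: |difference of class means| / overall std."""
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        pos = [v for v, t in zip(col, y) if t == 1]
        neg = [v for v, t in zip(col, y) if t == 0]
        scores.append(abs(mean(pos) - mean(neg)) / (pstdev(col) or 1.0))
    return scores


def select_top_k(X, y, k):
    """Return the indices of the k highest-scoring features, in ascending order."""
    scores = score_features(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def project(X, cols):
    """Keep only the listed feature columns."""
    return [[row[j] for j in cols] for row in X]


def fit_centroids(X, y):
    """Nearest-centroid 'training': one mean vector per class."""
    centroids = {}
    for label in (0, 1):
        rows = [r for r, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, X):
    """Assign each row to the class with the closest centroid."""
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [min(centroids, key=lambda c: dist2(row, centroids[c])) for row in X]


def f_measure(y_true, y_pred):
    """F-measure (harmonic mean of precision and recall) for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Separate seeds give independent training and test samples.
X_train, y_train = make_data(200, 8, seed=0)
X_test, y_test = make_data(200, 8, seed=1)

# Feature selection is fitted on the training data only.
selected = select_top_k(X_train, y_train, k=2)

# Baseline: all ten features.
full_model = fit_centroids(X_train, y_train)
f_all = f_measure(y_test, predict(full_model, X_test))

# With feature selection: only the selected columns.
sel_model = fit_centroids(project(X_train, selected), y_train)
f_sel = f_measure(y_test, predict(sel_model, project(X_test, selected)))

print("selected features:", selected)
print("F-measure, all features:     ", round(f_all, 3))
print("F-measure, selected features:", round(f_sel, 3))
```

On data this cleanly separable both configurations should score well, so the sketch shows only the shape of the comparison. It is not evidence for or against any particular feature selection method.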
