Abstract

Mining big data is challenging because of the size of the databases and the difficulty of keeping the data precise and non-redundant. Classification algorithms must analyse hundreds of independent features in these high-dimensional databases to predict effectively. Their performance can be enhanced if irrelevant and redundant data are removed. Feature selection algorithms help identify the prominent features that improve the classifier's performance. In addition, the classification performance of a support vector machine (SVM) can be enhanced by setting appropriate kernel parameters. In this work, the kernel parameters of the SVM are tuned for each feature subset generated by feature selection, and the resulting performance is analysed. The feature subset that yields the best SVM classification performance is taken as the optimal feature subset of the dataset. Experiments are conducted on three medical datasets. The empirical results show that integrating feature selection with kernel parameter optimisation enhances the performance of the SVM classifier. The approach is validated in terms of the increase in classifier accuracy and in the area under the receiver operating characteristic curve (AUC).
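
The procedure described above, tuning the SVM kernel parameters for each candidate feature subset and keeping the subset that performs best, can be illustrated with a minimal sketch. The sketch assumes scikit-learn and makes several choices the abstract does not specify: a univariate filter (`SelectKBest` with the ANOVA F-score) stands in for the feature selection algorithm, the kernel is RBF with parameters `C` and `gamma`, and the function name `tune_svm_with_feature_selection` is hypothetical. It should be read as one possible realisation, not as the method used in the experiments.

```python
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def tune_svm_with_feature_selection(X, y, k_values, C_values, gamma_values, seed=0):
    """Jointly search feature-subset size and RBF kernel parameters.

    Assumptions (not taken from the abstract): ANOVA F-score filter for
    feature selection, RBF kernel, 5-fold stratified cross-validation,
    and a binary classification target.
    """
    pipeline = Pipeline([
        ("scale", StandardScaler()),                    # SVMs are sensitive to feature scale
        ("select", SelectKBest(score_func=f_classif)),  # keeps the k top-ranked features
        ("svm", SVC(kernel="rbf")),
    ])
    # Every feature-subset size is paired with every (C, gamma) combination,
    # so the kernel parameters are tuned separately for each subset.
    param_grid = {
        "select__k": list(k_values),
        "svm__C": list(C_values),
        "svm__gamma": list(gamma_values),
    }
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    search = GridSearchCV(
        pipeline,
        param_grid,
        scoring={"accuracy": "accuracy", "auc": "roc_auc"},  # the two reported metrics
        refit="auc",  # the final model is refit on the configuration with the best mean AUC
        cv=cv,
    )
    search.fit(X, y)
    return search
```

As a usage example, the function can be run on scikit-learn's bundled breast cancer data via `load_breast_cancer(return_X_y=True)`, which serves here only as a stand-in for a medical dataset. After fitting, `search.best_params_` gives the selected subset size and kernel parameters, and `search.cv_results_["mean_test_accuracy"][search.best_index_]` gives the cross-validated accuracy of that configuration.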
