Abstract

To improve the classification accuracy of KNN on high-dimensional data, this paper presents a new method, RFVIM-KNN, which performs both feature selection and feature importance weighting for high-dimensional data. Random forests provide variable importance measures that identify the most important predictor variables and reveal the predictive structure of the data. Irrelevant features are eliminated according to their random forest variable importance scores, and the remaining importance scores are then used as feature weights in the KNN distance metric, so that the distance computation reflects the original predictive structure of the features. We applied the method to the classification of three high-dimensional data sets and compared it with other methods. The experimental results demonstrate that combining feature selection with feature importance weighting in the distance metric leads to a significant improvement in classification accuracy.
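
The following is a minimal sketch of the general idea described above (random forest importance scores used first to discard weak features and then as weights in the KNN distance). The dataset, the mean-importance threshold, and all hyperparameters are illustrative assumptions, not the paper's actual experimental setup.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Standardize so that importance weights, not raw feature scales, drive distances.
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# 1) Random forest variable importance measures.
rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_train, y_train)
importances = rf.feature_importances_

# 2) Feature selection: drop features below a threshold
#    (here the mean importance; the paper's actual selection rule may differ).
selected = importances >= importances.mean()
w = importances[selected]

# 3) Feature importance weighting in the distance metric.
#    Scaling each selected feature by sqrt(w) makes ordinary Euclidean KNN
#    equivalent to KNN under a weighted Euclidean distance with weights w.
X_train_w = X_train[:, selected] * np.sqrt(w)
X_test_w = X_test[:, selected] * np.sqrt(w)

knn = KNeighborsClassifier(n_neighbors=5).fit(X_train_w, y_train)
print("Importance-weighted KNN accuracy:", knn.score(X_test_w, y_test))
```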
