Abstract

This paper proposes a novel learning model that introduces the calculation of the pairwise gravitation of the selected patterns into the classical fixed radius nearest neighbor method, in order to overcome the drawback of the original nearest neighbor rule when dealing with imbalanced data. The traditional k nearest neighbor rule is considered to lose power on imbalanced datasets because the final decision might be dominated by the patterns from negative classes in spite of the distance measurements. Differently from the existing modified nearest neighbor learning model, the proposed method named GFRNN has a simple structure and thus becomes easy to work. Moreover, all parameters of GFRNN do not need initializing or coordinating during the whole learning procedure. In practice, GFRNN first selects patterns as candidates out of the training set under the fixed radius nearest neighbor rule, and then introduces the metric based on the modified law of gravitation in the physical world to measure the distance between the query pattern and each candidate. Finally, GFRNN makes the decision based on the sum of all the corresponding gravitational forces from the candidates on the query pattern. The experimental comparison validates both the effectiveness and the efficiency of GFRNN on forty imbalanced datasets, comparing to nine typical methods. As a conclusion, the contribution of this paper is constructing a new simple nearest neighbor architecture to deal with imbalanced classification effectively without any manually parameter coordination, and further expanding the family of the nearest neighbor based rules.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call