INCORPORATING DENSITY IN K-NEAREST NEIGHBORS REGRESSION

Mohamed A Mahfouz

doi:10.26483/ijarcs.v14i3.6989

Abstract

Traditional k-nearest neighbours (kNN) regression suffers from several difficulties when only a limited number of samples is available. In this paper, two density-based decision models are proposed. To reduce testing time, a k-nearest neighbours table (kNN-Table) is maintained. For each object x, it stores the neighbours of x, their weighted Manhattan distances to x, and, for each neighbour, a binary vector recording whether each dimension increases or decreases relative to x's values. In the first decision model, if the distance from the unseen sample to one of its neighbours x is less than the distance from x to its farthest neighbour, the label is estimated by linear interpolation; otherwise, linear extrapolation is used. In the second decision model, for each neighbour x of the unseen sample, the distance from the unseen sample to x and the corresponding binary vector are computed, and the set S of nearest neighbours of x is retrieved from the kNN-Table. For each sample in S, a normalized distance to the unseen sample is computed from the information stored in the kNN-Table and is used to weight each neighbour of the unseen sample's neighbours. In both models, the unseen sample is assigned a weighted average of the labels computed for its neighbours. The diversity between the two proposed decision models and the traditional kNN regressor motivates an ensemble that combines all three. The ensemble is evaluated, and the results show that it achieves a significant increase in performance compared with its base regressors and several related algorithms.
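As an illustration only, the kNN-Table described above might be built as in the following Python sketch. The function name `build_knn_table`, the per-dimension weight vector `w`, the brute-force neighbour search, and the convention that a bit is 1 only for a strict increase are all assumptions made for this example; the abstract does not specify them, and the two decision models are not reproduced here.

```python
import numpy as np

def build_knn_table(X, k, w=None):
    """Hypothetical kNN-Table: for every training object, store its k nearest
    neighbours, their weighted Manhattan distances, and one direction bit
    per dimension for each neighbour."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    # Assumption: uniform weights when none are supplied.
    w = np.ones(d) if w is None else np.asarray(w, dtype=float)

    # Pairwise weighted Manhattan distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    dist = (np.abs(diff) * w).sum(axis=2)
    np.fill_diagonal(dist, np.inf)  # an object is not its own neighbour

    # Indices and distances of the k nearest neighbours, nearest first.
    idx = np.argsort(dist, axis=1, kind="stable")[:, :k]
    nbr_dist = np.take_along_axis(dist, idx, axis=1)

    # Assumed encoding: 1 where the neighbour's value is strictly larger
    # than the object's value in that dimension, otherwise 0.
    direction = (X[idx] > X[:, None, :]).astype(np.uint8)  # shape (n, k, d)
    return idx, nbr_dist, direction

# Small usage example on four two-dimensional points.
X_train = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]]
idx, nbr_dist, direction = build_knn_table(X_train, k=2)
```

Because the table is computed once over the training set, a query only needs its distances to its own neighbours at test time; everything about those neighbours' neighbourhoods can then be looked up rather than recomputed, which is the saving in testing time that the abstract refers to.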
