Abstract

The traditional k-nearest neighbours (kNN) method suffers from several difficulties in regression analysis when only a limited number of samples is available. In this paper, two density-based decision models are proposed. To reduce testing time, a k-nearest neighbours table (kNN-Table) is maintained that stores, for each object x, its neighbours along with their weighted Manhattan distances to x and a binary vector representing the increase or decrease in each dimension relative to x's values. In the first decision model, if the distance from the unseen sample to one of its neighbours x is less than the distance to the farthest neighbour of x's neighbour, the label is estimated by linear interpolation; otherwise, linear extrapolation is used. In the second decision model, for each neighbour x of the unseen sample, the distance from the unseen sample to x and the corresponding binary vector are computed, and the set S of nearest neighbours of x is identified from the kNN-Table. For each sample in S, a normalized distance to the unseen sample is computed from the information stored in the kNN-Table and is used to weight each neighbour of the neighbours of the unseen object. In both models, a weighted average of the labels computed for each neighbour is assigned to the unseen object. The diversity between the two proposed decision models and the traditional kNN regressor motivates an ensemble of the two proposed models together with the traditional kNN regressor. The ensemble is evaluated, and the results show that it achieves a significant increase in performance compared to its base regressors and several related algorithms.
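As an illustration of the kNN-Table described above, the following is a minimal sketch rather than the authors' implementation. It assumes the per-dimension weights are given, that a bit of the binary vector is 1 when the neighbour's value exceeds x's value in that dimension and 0 otherwise, and that neighbours are found by brute force; the names `weighted_manhattan`, `direction_vector`, and `build_knn_table` are hypothetical.

```python
from typing import List, Sequence, Tuple

# One table entry: (neighbour index, weighted Manhattan distance, binary direction vector)
Entry = Tuple[int, float, List[int]]


def weighted_manhattan(a: Sequence[float], b: Sequence[float],
                       w: Sequence[float]) -> float:
    """Sum of per-dimension absolute differences, each scaled by its weight."""
    return sum(wi * abs(ai - bi) for ai, bi, wi in zip(a, b, w))


def direction_vector(x: Sequence[float], other: Sequence[float]) -> List[int]:
    """Assumed encoding: 1 where `other` is larger than x in that dimension, else 0."""
    return [1 if o > xi else 0 for xi, o in zip(x, other)]


def build_knn_table(X: Sequence[Sequence[float]], weights: Sequence[float],
                    k: int) -> List[List[Entry]]:
    """For every sample, store its k nearest neighbours with distance and direction."""
    table: List[List[Entry]] = []
    for i, x in enumerate(X):
        entries = [
            (j, weighted_manhattan(x, y, weights), direction_vector(x, y))
            for j, y in enumerate(X)
            if j != i  # a sample is not its own neighbour
        ]
        entries.sort(key=lambda e: e[1])  # nearest first
        table.append(entries[:k])
    return table


# Toy data: four 2-D samples with equal dimension weights (illustrative only).
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [2.0, 2.0]]
table = build_knn_table(X, weights=[1.0, 1.0], k=2)
```

Because the table is built once from the training samples, the neighbours of any training object, together with their distances and direction vectors, can be looked up at test time instead of being recomputed.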
