Abstract

Weighted nearest neighbors (WNN) classifiers are popular non-parametric classifiers. Despite significant progress on WNN, most existing WNN classifiers are designed for traditional supervised learning problems, where training samples and test samples are assumed to be independent and identically distributed. In many real applications, however, it can be difficult or expensive to obtain training samples from the distribution of interest, so data collected from related distributions are often used as supplementary training data for the classification task under the distribution of interest. It is therefore essential to develop effective classification methods that can incorporate both training samples from the distribution of interest (if they exist) and supplementary training samples from a different but related distribution. To address this challenge, we propose a novel Transfer learning weighted Nearest Neighbors (TNN) classifier. As a WNN classifier, TNN adaptively determines the weights on the class labels of the training samples for each test sample by minimizing an upper bound on the conditional expectation of the estimation error of the regression function, placing decreasing weights on the class labels of successively more distant neighbors. To accommodate the difference between training samples from the distribution of interest and supplementary training samples, TNN adds a non-negative offset to the distance between each supplementary training sample and the test sample, thereby constraining the excessive influence of the supplementary training samples on the prediction. Our theoretical studies show that, under certain conditions, TNN is consistent and minimax optimal (up to a logarithmic factor) in the covariate shift setting.
In the posterior drift setting, or the more general setting where both covariate shift and posterior drift exist, the excess risk of TNN depends on the maximum posterior discrepancy between the distribution of the supplementary training samples and the distribution of interest. Both our simulation studies and an application to the land use/land cover mapping problem in geography demonstrate that TNN outperforms existing methods and can serve as an effective tool for transfer learning.
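The core mechanism described above, adding a non-negative offset to the distances of supplementary (source) samples before selecting and weighting neighbors, can be illustrated with a minimal sketch. Note this is a toy illustration, not the paper's method: the function name `tnn_predict`, the parameter names, the Euclidean metric, and the simple `1/i` weight scheme are all assumptions; the actual TNN classifier chooses its weights by minimizing an upper bound on the estimation error rather than using a fixed scheme.

```python
import numpy as np

def tnn_predict(x, X_target, y_target, X_source, y_source, k=3, offset=0.5):
    """Toy TNN-style prediction for one test point with binary labels 0/1.

    A non-negative `offset` is added to every source-sample distance,
    which limits the influence of supplementary data on the prediction.
    (Illustrative sketch only; weight choice differs from the paper.)
    """
    # Distances from the test point to target samples, and to source
    # samples with the non-negative offset added.
    d_target = np.linalg.norm(X_target - x, axis=1)
    d_source = np.linalg.norm(X_source - x, axis=1) + offset

    dists = np.concatenate([d_target, d_source])
    labels = np.concatenate([y_target, y_source])

    # k nearest neighbors under the shifted distances.
    order = np.argsort(dists)[:k]

    # Decreasing weights on successively more distant neighbors
    # (here simply proportional to 1/i, normalized to sum to one).
    w = 1.0 / np.arange(1, k + 1)
    w /= w.sum()

    score = np.dot(w, labels[order])  # weighted estimate of P(Y=1|x)
    return int(score > 0.5)
```

With a large `offset`, source samples are effectively pushed out of the neighbor set, so the classifier falls back on target data alone; with `offset=0`, source and target samples are treated identically.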
