Abstract

A key assumption in supervised learning is that training and test data follow the same probability distribution. However, this fundamental assumption is not always satisfied in practice, e.g., due to changing environments, sample selection bias, privacy concerns, or high labeling costs. Transfer learning (TL) relaxes this assumption and allows us to learn under distribution shift. Classical TL methods typically rely on importance weighting—a predictor is trained based on the training losses weighted according to the importance (i.e., the test-over-training density ratio). However, as real-world machine learning tasks are becoming increasingly complex, high-dimensional, and dynamic, novel approaches have recently been explored to cope with such challenges. In this chapter, after introducing the foundation of TL based on importance weighting, we review recent advances in joint and dynamic importance-predictor estimation. Furthermore, we introduce a method of causal mechanism transfer that incorporates causal structure in TL. Finally, we discuss future perspectives of TL research.
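To make the importance-weighting idea concrete, the following is a minimal sketch (not the chapter's specific method): the test-over-training density ratio is estimated here with a probabilistic domain classifier, one of several possible estimators, and the predictor is then fit with each training loss weighted by that ratio. The variable names, the toy data, and the use of scikit-learn's `sample_weight` are illustrative assumptions.

```python
# Minimal sketch of importance-weighted training under covariate shift.
# Assumed setup: labeled training data (X_tr, y_tr) and unlabeled test
# inputs X_te; the density ratio is estimated via a domain classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression, Ridge

rng = np.random.default_rng(0)

# Toy data: training and test inputs drawn from shifted distributions.
X_tr = rng.normal(loc=0.0, scale=1.0, size=(500, 1))
y_tr = np.sin(X_tr[:, 0]) + 0.1 * rng.normal(size=500)
X_te = rng.normal(loc=1.0, scale=0.5, size=(500, 1))  # unlabeled test inputs

# Step 1: estimate the importance w(x) = p_test(x) / p_train(x) with a
# domain classifier, using w(x) proportional to P(test | x) / P(train | x).
X_dom = np.vstack([X_tr, X_te])
d_dom = np.concatenate([np.zeros(len(X_tr)), np.ones(len(X_te))])
domain_clf = LogisticRegression().fit(X_dom, d_dom)
p_te_given_x = domain_clf.predict_proba(X_tr)[:, 1]
weights = (p_te_given_x / (1.0 - p_te_given_x)) * (len(X_tr) / len(X_te))

# Step 2: importance-weighted empirical risk minimization -- each training
# loss is weighted by the estimated importance before fitting the predictor.
predictor = Ridge(alpha=1.0).fit(X_tr, y_tr, sample_weight=weights)

# The fitted predictor now targets the test distribution, up to the
# accuracy of the estimated density ratio.
print(predictor.coef_, predictor.intercept_)
```

In this sketch the two steps are decoupled: the ratio is estimated first and the predictor second, which is the classical pipeline; the joint and dynamic importance-predictor estimation reviewed in the chapter instead couples these steps.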
