Abstract

Transfer learning enables re-using knowledge learned on a source task to help learning a target task. A simple form of transfer learning is common in current state-of-the-art computer vision models, i.e., pre-training a model for image classification on the ILSVRC dataset, and then fine-tuning on any target task. However, previous systematic studies of transfer learning have been limited and the circumstances in which it is expected to work are not fully understood. In this paper we carry out an extensive experimental exploration of transfer learning across vastly different image domains (consumer photos, autonomous driving, aerial imagery, underwater, indoor scenes, synthetic, close-ups) and task types (semantic segmentation, object detection, depth estimation, keypoint detection). Importantly, these are all complex, structured output task types relevant to modern computer vision applications. In total we carry out over 2000 transfer learning experiments, including many where the source and target come from different image domains, task types, or both. We systematically analyze these experiments to understand the impact of image domain, task type, and dataset size on transfer learning performance. Our study leads to several insights and concrete recommendations: (1) for most tasks there exists a source which significantly outperforms ILSVRC'12 pre-training; (2) the image domain is the most important factor for achieving positive transfer; (3) the source dataset should include the image domain of the target dataset to achieve best results; (4) at the same time, we observe only small negative effects when the image domain of the source task is much broader than that of the target; (5) transfer across task types can be beneficial, but its success is heavily dependent on both the source and target task types.

Highlights

  • Transfer learning is omnipresent in computer vision

  • We study transfer learning at scale across three factors of influence: the difference in image domain between source and target tasks (Sect. 3.2), their difference in task type (Sect. 3.3), and the size of the source and target training sets (Sect. 3.4)

  • This paper explores transfer learning from a source task to a target task


Introduction

The common practice is transfer learning through ILSVRC’12 pre-training: train on the ILSVRC’12 image classification task [4], copy the resulting weights to a target model, fine-tune for the target task at hand. This strategy was shown to be effective on a wide variety of datasets and task types, including image classification [7], [17], [37], [42], [53], object detection [35], semantic segmentation [85], human pose estimation [38], [118], [121], and depth estimation [16], [29]. Most previous work focuses solely on image classification [23], [33], [55], [61], [72], [82],
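The pre-train/copy-weights/fine-tune recipe described above can be illustrated in miniature. The following is a toy NumPy sketch, not the paper's actual setup: it uses a tiny two-layer regression model in place of a deep network, a synthetic source task with abundant data, and a related target task with few examples. The backbone weights learned on the source are copied into the target model, whose fresh head is then fine-tuned (here with the backbone frozen).

```python
import numpy as np

rng = np.random.default_rng(0)

def mse(pred, y):
    return float(np.mean((pred - y) ** 2))

def train(X, Y, W_b, W_h, lr=0.1, steps=2000, freeze_backbone=False):
    """Gradient descent on MSE for a tiny two-layer net: tanh(X @ W_b) @ W_h."""
    for _ in range(steps):
        H = np.tanh(X @ W_b)                 # backbone features
        E = H @ W_h - Y                      # prediction error
        G_h = H.T @ E / len(Y)               # head gradient
        if not freeze_backbone:
            G_b = X.T @ ((E @ W_h.T) * (1 - H ** 2)) / len(Y)
            W_b = W_b - lr * G_b
        W_h = W_h - lr * G_h
    return W_b, W_h

d_in, d_feat = 8, 4
W_true = rng.normal(size=(d_in, d_feat))     # shared structure between tasks

# Source task: large dataset whose labels depend on the shared backbone.
Xs = rng.normal(size=(500, d_in))
Ys = np.tanh(Xs @ W_true)

# Target task: only 20 examples, a different head on the same backbone.
h_tgt = rng.normal(size=(d_feat, 1))
Xt = rng.normal(size=(20, d_in))
Yt = np.tanh(Xt @ W_true) @ h_tgt

# 1) Pre-train backbone + head on the source task.
W_b, _ = train(Xs, Ys,
               0.5 * rng.normal(size=(d_in, d_feat)),
               0.5 * rng.normal(size=(d_feat, d_feat)))

# 2) Copy the backbone, attach a fresh head, fine-tune on the target task.
W_h_new = np.zeros((d_feat, 1))
before = mse(np.tanh(Xt @ W_b) @ W_h_new, Yt)
W_b_ft, W_h_ft = train(Xt, Yt, W_b, W_h_new, freeze_backbone=True)
after = mse(np.tanh(Xt @ W_b_ft) @ W_h_ft, Yt)
print(f"target MSE before fine-tuning: {before:.3f}, after: {after:.3f}")
```

In practice the same three steps are applied to deep networks: the convolutional backbone pre-trained on ILSVRC'12 replaces `W_b`, and the task-specific output layers replace `W_h`.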

