Abstract

Transfer learning is a highly successful approach to training deep neural networks for customized image recognition tasks. It leverages deep learning models that were previously trained on massive datasets and re-trains them for a novel image recognition dataset. The advantage of transfer learning has typically been measured in sample efficiency; here we instead investigate its computational efficiency. A good pre-trained model provides features that can be used as input to a new classifier (usually the top layers of a neural network). We show that if a good pre-trained model is selected, training a new classifier can be much more computationally efficient than training a deep neural network without transfer learning. The first step in transfer learning is to select a pre-trained model to use as a feed-forward network for generating features. This selection process relies either on human intuition and convenience or on a methodical but computationally intensive validation loop. Instead, we would prefer a method that selects the pre-trained model that will produce the best transfer results for the least amount of computation. To this end, we provide a computationally efficient metric for the fit between a pre-trained model and a novel image recognition task. The better the fit, the less computation is needed to re-train the pre-trained model for the novel task. As machine learning becomes ubiquitous, highly accurate trained models will proliferate, and computationally efficient transfer learning methods will enable rapid development of new image recognition models.
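To illustrate the feature-extraction setup described above, the following minimal sketch freezes a pre-trained backbone and trains only a small new classifier on its output features. The choice of ResNet-50 with ImageNet weights, a 10-class target task, and the optimizer settings are assumptions for the sketch, not the selection metric or experimental setup of the paper.

import torch
import torch.nn as nn
from torchvision import models

# Assumed example backbone: ResNet-50 pre-trained on ImageNet, used as a
# fixed feed-forward feature extractor.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
backbone.fc = nn.Identity()          # drop the original classification head
for p in backbone.parameters():
    p.requires_grad = False          # freeze: only the new classifier is trained
backbone.eval()

# New classifier ("top layers") for a hypothetical 10-class target task.
num_target_classes = 10
classifier = nn.Linear(2048, num_target_classes)  # ResNet-50 features are 2048-d

optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def training_step(images, labels):
    # Features come from the frozen backbone, so no gradients flow through it;
    # only the lightweight classifier is updated, which is the source of the
    # computational savings discussed in the abstract.
    with torch.no_grad():
        features = backbone(images)
    logits = classifier(features)
    loss = loss_fn(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()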
