Abstract

It is usually difficult to find datasets large enough to train Deep Convolutional Neural Networks (DCNNs) from scratch. In practice, a neural network is often pre-trained on a very large source dataset and then fine-tuned on the target dataset. This approach is a form of transfer learning, and it allows very deep networks to achieve outstanding performance even when only a small target dataset is available. It is thought that the bottom layers of the pre-trained network contain general information, which is applicable across different datasets and tasks, while the upper layers contain abstract information specific to the source dataset and task. While the fine-tuning of these layers has been studied, their removal has not yet been considered. This paper explores the effect of removing the upper convolutional layers of a pre-trained network. We empirically investigated whether removing upper layers of a deep pre-trained network can improve performance for transfer learning. We found that removing upper pre-trained layers gives a significant boost in performance, but that the ideal number of layers to remove depends on the dataset. We suggest removing pre-trained convolutional layers when applying transfer learning with off-the-shelf pre-trained DCNNs. The ideal number of layers to remove depends on the dataset and remains a parameter to be tuned.
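To make the idea concrete, the following is a minimal sketch of what removing the upper convolutional layers of an off-the-shelf pre-trained network might look like. The paper's abstract does not name a specific architecture or framework, so the choice of PyTorch/torchvision, the VGG16 backbone, the `truncated_vgg16` helper, and the `convs_to_remove` parameter are all illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
from torchvision import models


def truncated_vgg16(num_classes: int, convs_to_remove: int = 2) -> nn.Module:
    """Illustrative sketch: drop the top `convs_to_remove` convolutional
    layers from an ImageNet-pretrained VGG16 and attach a new classifier
    head for the target dataset (a hypothetical setup, not the paper's)."""
    backbone = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)

    # Indices of the Conv2d modules inside the `features` Sequential.
    conv_idx = [i for i, m in enumerate(backbone.features)
                if isinstance(m, nn.Conv2d)]
    # Keep everything strictly below the first removed convolution.
    cutoff = conv_idx[-convs_to_remove]
    features = backbone.features[:cutoff]

    # Freeze the retained pre-trained layers; only the new head is trained.
    for p in features.parameters():
        p.requires_grad = False

    # Infer the flattened feature size with a dummy forward pass.
    with torch.no_grad():
        n_feats = features(torch.zeros(1, 3, 224, 224)).flatten(1).shape[1]

    head = nn.Sequential(nn.Flatten(), nn.Linear(n_feats, num_classes))
    return nn.Sequential(features, head)


# Example: a 10-class target task with the top two conv layers removed.
model = truncated_vgg16(num_classes=10, convs_to_remove=2)
```

In this sketch, `convs_to_remove` plays the role of the tuning parameter discussed in the abstract: the number of upper pre-trained layers to discard would be selected per target dataset, for example via validation performance.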
