Abstract

Some neural networks can be trained by transfer learning, which exploits a neural network pre-trained on a source task, when only a small dataset is available for the target task. The performance of transfer learning depends on which knowledge (i.e., which layers) is selected from the pre-trained network. At present, this knowledge is usually chosen by humans. The transfer learning method PathNet automatically selects pre-trained modules or adjustable modules in a modular neural network. However, PathNet requires a modular neural network as the pre-trained network, so non-modular pre-trained neural networks currently cannot be used; this limits the versatility of the network structure. To address this limitation, we propose Stepwise PathNet, which regards the layers of a non-modular pre-trained neural network as the modules in PathNet and selects the layers automatically through training. In an experimental validation of transfer learning from InceptionV3 pre-trained on the ImageNet dataset to networks trained on three other datasets (CIFAR-100, SVHN, and Food-101), Stepwise PathNet was up to 8% and 10% more accurate than fine-tuned and from-scratch approaches, respectively. Moreover, some of the selected layers were not supported by the layer functions assumed in PathNet.

Highlights

  • Some neural networks can be trained by transfer learning, which exploits a neural network pre-trained on a source task, when only a small dataset is available for the target task

  • The transfer-learning performance of Stepwise PathNet with a convolutional neural network (CNN) was evaluated on three datasets using InceptionV3 pre-trained on ImageNet

  • We introduce an improved version of the tournament selection algorithm (TSA) for use in Stepwise PathNet

Introduction

A neural network is a machine learning method that requires a relatively large labeled training dataset. Transfer learning addresses this problem by reducing the required size of the training dataset for the target task; to this end, it exploits the knowledge gained by a neural network pre-trained on a source task. However, the pre-trained networks that PathNet uses must be modular neural networks, and a non-modular CNN is difficult to use even if a module supports a convolutional layer. The current PathNet is therefore available for modular neural networks only and needs to be extended to general network structures such as CNNs. To address this limitation, we propose Stepwise PathNet, which regards the layers of a non-modular pre-trained neural network as the modules in PathNet and selects the layers automatically through training. Notably, some of the layers selected this way were not supported by the layer functions assumed in PathNet.
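The automatic layer selection described above can be illustrated with a minimal Python sketch of PathNet-style tournament selection. This is not the authors' implementation: the genotype here is a hypothetical per-layer binary choice between reusing a frozen pre-trained layer (True) and training that layer on the target task (False), and the `evaluate` function is a toy stand-in for the validation accuracy that would be measured after briefly training the corresponding network.

```python
import random

def evaluate(genotype):
    # Toy fitness for illustration only: in the real method this would be
    # validation accuracy after briefly training the network whose layers
    # are frozen pre-trained (True) or re-initialised and trainable (False).
    return sum(genotype) / len(genotype)

def mutate(genotype, rate=0.1):
    # Flip each per-layer choice with a small probability.
    return [(not g) if random.random() < rate else g for g in genotype]

def tournament_selection(num_layers=10, population_size=8, generations=20):
    # Each genotype encodes, per layer, whether to reuse the pre-trained
    # layer (True) or train that layer from scratch (False).
    population = [[random.random() < 0.5 for _ in range(num_layers)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Pick two genotypes at random and compare their fitness.
        a, b = random.sample(range(population_size), 2)
        fa, fb = evaluate(population[a]), evaluate(population[b])
        winner, loser = (a, b) if fa >= fb else (b, a)
        # The loser is overwritten by a mutated copy of the winner.
        population[loser] = mutate(list(population[winner]))
    return max(population, key=evaluate)
```

Under this toy fitness, the population drifts toward genotypes that reuse more pre-trained layers; with a real fitness function, the surviving genotype would instead encode whichever mix of frozen and adjustable layers best fits the target dataset.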

