Abstract

Deep-learning techniques for knowledge transfer are necessary to advance and optimize efficient knowledge distillation. Here, we develop a new adversarial optimization-based knowledge transfer method built on a layer-wise dense flow distilled from a pre-trained deep neural network (DNN). Multiple flow-based knowledge items are densely extracted from overlapping layer groups of the pre-trained DNN and transferred to a target DNN through adversarial loss functions, enhancing the knowledge available to the target network. We propose a semi-supervised knowledge transfer scheme that uses these multiple items of dense flow-based knowledge extracted from the pre-trained DNN. The proposed loss function comprises a supervised cross-entropy loss for the usual classification task, an adversarial training loss for the target DNN and the discriminators, and a Euclidean distance-based loss over the dense flow. For both the pre-trained and target DNNs considered in this study, we adopt a residual network (ResNet) architecture. We propose (1) adversarial knowledge optimization, (2) an extended flow-based knowledge transfer scheme, and (3) a layer-wise dense flow combined with an adversarial network. The results show that the improved target ResNet achieves higher accuracy than prior knowledge transfer methods.
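
Combining the three terms named in the abstract, the overall training objective for the target DNN can plausibly be written as shown below; the weighting coefficients, the per-flow index k, and the symbols for the flow features are our own notation and are not specified in the abstract:

  \mathcal{L}_{\mathrm{target}}
    = \mathcal{L}_{\mathrm{CE}}\big(y, \hat{y}\big)
    + \lambda_{\mathrm{adv}}\,\mathcal{L}_{\mathrm{adv}}\big(T, D_1, \ldots, D_K\big)
    + \lambda_{\mathrm{flow}} \sum_{k=1}^{K} \big\lVert F_P^{(k)} - F_T^{(k)} \big\rVert_2^2

where F_P^{(k)} and F_T^{(k)} denote the k-th dense-flow knowledge item extracted from the pre-trained and target DNNs, respectively, T is the target DNN, and D_1, ..., D_K are the discriminators that drive the adversarial term.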

Highlights

  • In the past few years, as deep-learning technology has advanced dramatically, state-of-the-art deep neural network (DNN) models have found applications in several fields, ranging from computer vision to natural language processing [1,2,3,4,5,6,7,8,9,10]

  • We propose a layer-wise dense flow (LDF)-based knowledge transfer technique coupled with an adversarial network to generate low-complexity, high-accuracy DNN models that can be adaptively applied to target domains with limited computing resources

  • The sequential scheme repeatedly transfers dense flow from the bottom layers to the top between the pre-trained and target DNNs, whereas the concurrent scheme transfers all dense flow into the target DNN simultaneously (see the sketch after this list)
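
The two schedules can be summarized with the short sketch below. This is a minimal illustration only, assuming PyTorch-style tensors and hypothetical helper names (dense_flow_loss, extract_pre, extract_tgt); the paper does not provide reference code, and the adversarial and cross-entropy terms are omitted here for brevity.

  import torch  # dense-flow features are assumed to be torch.Tensor feature maps

  def dense_flow_loss(pre_flows, tgt_flows):
      # Sum of squared Euclidean distances between matching dense-flow items.
      # The pre-trained features are detached so only the target DNN is updated.
      return sum(((t - p.detach()) ** 2).sum() for p, t in zip(pre_flows, tgt_flows))

  def sequential_transfer(extract_pre, extract_tgt, optimizer, loader, num_stages):
      # Sequential scheme: transfer dense flow stage by stage, from bottom to top.
      # Lower stages stay in the objective as the transfer moves upward (one
      # plausible reading of the repetitive sequential schedule).
      for stage in range(num_stages):
          for x, _ in loader:
              pre = extract_pre(x)[: stage + 1]
              tgt = extract_tgt(x)[: stage + 1]
              loss = dense_flow_loss(pre, tgt)
              optimizer.zero_grad()
              loss.backward()
              optimizer.step()

  def concurrent_transfer(extract_pre, extract_tgt, optimizer, loader):
      # Concurrent scheme: match every dense-flow stage simultaneously per update.
      for x, _ in loader:
          loss = dense_flow_loss(extract_pre(x), extract_tgt(x))
          optimizer.zero_grad()
          loss.backward()
          optimizer.step()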

Summary

Introduction

In the past few years, as deep-learning technology has advanced dramatically, state-of-the-art deep neural network (DNN) models have found applications in several fields, ranging from computer vision to natural language processing [1,2,3,4,5,6,7,8,9,10]. Generally, top-performing DNNs have deep and wide architectures with an enormous number of parameters, which significantly increases training time and computational cost. Transfer learning [20] is a reasonable candidate for addressing this limitation because it leverages the knowledge gained from solving one task and applies it to other, similar tasks. When wide and deep DNNs are successfully trained, they usually contain a wealth of knowledge within their learned parameters.
