Abstract

Deep neural networks (DNNs) have been applied to various fields and have achieved high performance. However, they require significant computing resources because of their numerous parameters, even though some of those parameters are redundant and do not contribute to performance. Recently, many knowledge distillation-based methods have been proposed to address this problem by compressing a large DNN model into a small one. In this paper, we propose a novel knowledge distillation method that compresses a vehicle maker classification system based on a cascaded convolutional neural network (CNN) into a single CNN structure. The system uses mask regions with CNN features (Mask R-CNN) as a preprocessor for vehicle region detection and is structured to be used in conjunction with a CNN classifier. Through this preprocessor, the classifier receives a background-removed vehicle image, allowing it to focus its attention on the vehicle region. With this cascaded structure, the system classifies vehicle makers with about 91% accuracy. Notably, when we compress the system into a single CNN structure through the proposed knowledge distillation method, it achieves about 89% accuracy, a loss of only about 2%. Our experimental results show that the proposed method is superior to the conventional knowledge distillation method in terms of performance transfer.

Highlights

  • Deep neural networks (DNNs) have garnered attention owing to their excellent performance in various fields such as computer vision, speech recognition, and big data

  • We propose a novel knowledge distillation method based on a feature map distance that can compress a convolutional neural network (CNN)-cascaded system into a single CNN structure, and demonstrate its feasibility through experiments (a loss sketch follows this list)

  • The proposed system uses the coordinate information of the vehicle region provided by the Mask R-CNN to crop the input image and resize it to the classifier's input size (see the preprocessing sketch below)
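
As a concrete illustration of the second highlight, a minimal PyTorch sketch of a feature-map-distance distillation loss follows. The L2 distance, the layer pairing, and the weighting factor alpha are assumptions for illustration, not the authors' exact formulation:

    import torch
    import torch.nn.functional as F

    def feature_distillation_loss(student_logits, student_fmap, teacher_fmap,
                                  labels, alpha=0.5):
        # Hard-label classification loss for the student
        ce = F.cross_entropy(student_logits, labels)
        # Match spatial sizes if teacher and student feature maps differ;
        # this assumes the channel counts already agree (otherwise a 1x1
        # conv adapter on the student side would be needed)
        if student_fmap.shape[2:] != teacher_fmap.shape[2:]:
            teacher_fmap = F.adaptive_avg_pool2d(teacher_fmap, student_fmap.shape[2:])
        # L2 distance to the teacher feature map, treated as a fixed target
        fm_dist = F.mse_loss(student_fmap, teacher_fmap.detach())
        return ce + alpha * fm_dist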
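
Likewise, the preprocessing step in the third highlight can be sketched as below. The 224 x 224 classifier input size, the score threshold, and the use of torchvision's off-the-shelf Mask R-CNN are assumptions standing in for the paper's own detector:

    import torch
    import torch.nn.functional as F
    import torchvision

    # Off-the-shelf Mask R-CNN detector from torchvision
    detector = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
    detector.eval()

    def crop_vehicle(image, classifier_size=(224, 224), score_thresh=0.7):
        # image: 3xHxW float tensor in [0, 1]
        with torch.no_grad():
            pred = detector([image])[0]
        # Detections come back sorted by score; in practice one would also
        # filter pred["labels"] for vehicle classes
        keep = pred["scores"] > score_thresh
        if keep.any():
            x1, y1, x2, y2 = pred["boxes"][keep][0].int().tolist()
            image = image[:, y1:y2, x1:x2]
        # Resize the (cropped) vehicle region to the classifier's input size
        return F.interpolate(image.unsqueeze(0), size=classifier_size,
                             mode="bilinear", align_corners=False).squeeze(0)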


Summary

Introduction

Deep neural networks (DNNs) have garnered attention owing to their excellent performance in various fields such as computer vision, speech recognition, and big data. Hinton et al. proposed a method that provides the teacher's softmax output as a training criterion, treating an ensemble model as the teacher and a single network as the student. A CNN classifier trained with background-removed data such as (c) can extract most of its features from the vehicle region, which significantly reduces the learning of unnecessary information.
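
For reference, the conventional soft-target criterion of Hinton et al., against which the proposed method is compared, can be sketched as follows; the temperature T and mixing weight alpha are the usual hyperparameters, and the values here are illustrative:

    import torch.nn.functional as F

    def hinton_kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
        # Soft-target term: KL divergence between temperature-softened
        # teacher and student distributions
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=1),
            F.softmax(teacher_logits / T, dim=1),
            reduction="batchmean",
        ) * (T * T)  # restores gradient scale, as in the original paper
        # Hard-label term on the ground truth
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1 - alpha) * hard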

