Abstract

The deployment of deep neural network (DNN) models on mobile and embedded devices has been hindered by their large number of weights. In this work, we develop a DNN model compression service, termed MicroBrain, that reduces resource usage for energy-efficient visual inference. By automatically analyzing trained DNN models, the service applies a high-performance compression approach that performs resource control via four modules. The proposed service, together with the compression approach, condenses DNN models at a 20–30x compression rate with negligible accuracy loss, which facilitates their deployment on mobile devices for energy-efficient visual inference. We evaluate two representative models, AlexNet and VGG-16, on object recognition and face verification tasks; the results demonstrate the effectiveness of the proposed approach.
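To make the notion of a 20–30x compression rate concrete, the following is a minimal sketch of two standard compression steps, magnitude pruning and uniform 8-bit quantization, applied to a toy layer. This is a generic illustration under assumed parameters (90% sparsity, 8-bit weights), not the paper's four-module MicroBrain pipeline; the function names and the storage estimate (which ignores sparse-index overhead) are illustrative assumptions.

```python
import numpy as np

def prune_by_magnitude(weights, sparsity=0.9):
    """Zero out the smallest-magnitude weights so that `sparsity` fraction becomes zero."""
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) > threshold
    return weights * mask, mask

def quantize_uniform(weights, num_bits=8):
    """Uniformly quantize weights to signed `num_bits` integers; return codes and scale."""
    scale = np.abs(weights).max() / (2 ** (num_bits - 1) - 1)
    codes = np.round(weights / scale).astype(np.int8)
    return codes, scale

# Toy fully connected layer standing in for one layer of a trained DNN.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.05, size=(256, 512)).astype(np.float32)

pruned, mask = prune_by_magnitude(w, sparsity=0.9)
codes, scale = quantize_uniform(pruned, num_bits=8)

# Rough storage comparison: float32 dense baseline vs. int8 values for surviving weights
# (sparse index overhead is ignored in this back-of-the-envelope estimate).
dense_bytes = w.size * 4
compressed_bytes = int(mask.sum()) * 1
print(f"approx compression: {dense_bytes / compressed_bytes:.1f}x")
```

Real services additionally retrain the pruned model to recover accuracy and store quantized weights in a sparse format; this sketch only illustrates where the order-of-magnitude size reduction comes from.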
