Abstract

The deployment of deep neural network (DNN) models on mobile and embedded devices has been hindered by their large number of weights. In this work, we develop a DNN model compression service, termed MicroBrain, that reduces resource usage for energy-efficient visual inference. By automatically analyzing trained DNN models, the service applies a high-performance compression approach that performs resource control via four modules. The proposed service, together with the compression approach, condenses DNN models at a 20–30x compression rate with negligible accuracy loss, which facilitates their deployment on mobile devices for energy-efficient visual inference. We evaluate two representative models, AlexNet and VGG-16, on object recognition and face verification tasks; the results demonstrate the effectiveness of the proposed approach.
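To make the notion of a 20–30x compression rate concrete, the following is a minimal sketch of two standard compression steps, magnitude pruning and uniform 8-bit quantization, applied to a toy layer. This is a generic illustration under assumed parameters (90% sparsity, 8-bit weights), not the paper's four-module MicroBrain pipeline; the function names and the storage estimate (which ignores sparse-index overhead) are illustrative assumptions.

```python
import numpy as np

def prune_by_magnitude(weights, sparsity=0.9):
    """Zero out the smallest-magnitude weights so that `sparsity` fraction becomes zero."""
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) > threshold
    return weights * mask, mask

def quantize_uniform(weights, num_bits=8):
    """Uniformly quantize weights to signed `num_bits` integers; return codes and scale."""
    scale = np.abs(weights).max() / (2 ** (num_bits - 1) - 1)
    codes = np.round(weights / scale).astype(np.int8)
    return codes, scale

# Toy fully connected layer standing in for one layer of a trained DNN.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.05, size=(256, 512)).astype(np.float32)

pruned, mask = prune_by_magnitude(w, sparsity=0.9)
codes, scale = quantize_uniform(pruned, num_bits=8)

# Rough storage comparison: float32 dense baseline vs. int8 values for surviving weights
# (sparse index overhead is ignored in this back-of-the-envelope estimate).
dense_bytes = w.size * 4
compressed_bytes = int(mask.sum()) * 1
print(f"approx compression: {dense_bytes / compressed_bytes:.1f}x")
```

Real services additionally retrain the pruned model to recover accuracy and store quantized weights in a sparse format; this sketch only illustrates where the order-of-magnitude size reduction comes from.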
