Abstract

Advances in deep neural networks (DNNs) have fostered a wide spectrum of intelligent mobile applications, ranging from voice assistants on smartphones to augmented reality with smart glasses. To deliver high-quality services, these DNNs should operate on resource-constrained mobile platforms and yield consistent performance in open environments. However, DNNs are notoriously resource-intensive and often suffer from performance degradation in real-world deployments. Existing research strives to optimize the performance-resource trade-off of DNNs by compressing the model without notably compromising its inference accuracy. Accordingly, the accuracy of these compressed DNNs is bounded by that of the original ones, leading to more severe accuracy drops in challenging yet common scenarios such as low-resolution, small-size, and motion-blurred inputs. In this paper, we propose to push forward the frontiers of the DNN performance-resource trade-off by introducing human intelligence as a new design dimension. To this end, we explore human-in-the-loop DNNs (H-DNNs) and their automatic performance-resource optimization. We present H-Gen, an automatic H-DNN compression framework that incorporates human participation as a new hyperparameter for accurate and efficient DNN generation. It involves novel hyperparameter formulation, metric calculation, and search strategies in the context of automatic H-DNN generation. We also propose human participation mechanisms for three common DNN architectures to showcase the feasibility of H-Gen. Extensive experiments on twelve categories of challenging samples with three common DNN architectures demonstrate the superiority of H-Gen in terms of the overall trade-off between performance (accuracy, latency) and resources (storage, energy, human labour).
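The abstract describes H-Gen as treating human participation as an additional compression hyperparameter, searched jointly with conventional ones against a multi-metric trade-off of accuracy, latency, storage, energy, and human labour. The sketch below is a rough illustration of that idea only, not the authors' formulation: the configuration fields, the synthetic numbers in evaluate(), the scoring weights, and the grid-search space are all illustrative assumptions.

```python
# Minimal sketch (assumptions, not the H-Gen implementation) of treating human
# participation as one more hyperparameter in an H-DNN compression search.
from dataclasses import dataclass
from itertools import product

@dataclass
class Config:
    pruning_ratio: float        # conventional compression hyperparameter
    quant_bits: int             # conventional compression hyperparameter
    human_participation: float  # fraction of hard samples routed to a human (new dimension)

def evaluate(cfg: Config) -> dict:
    """Placeholder: measure the compressed H-DNN under cfg.
    A real system would compress the model, deploy it, and profile it;
    the numbers below are synthetic and only shape the example."""
    accuracy = 0.70 + 0.2 * cfg.human_participation - 0.1 * cfg.pruning_ratio
    latency_ms = 50 * (1 - cfg.pruning_ratio) + 200 * cfg.human_participation
    storage_mb = 20 * (1 - cfg.pruning_ratio) * cfg.quant_bits / 32
    energy_mj = 5 * (1 - cfg.pruning_ratio)
    labour = cfg.human_participation
    return dict(accuracy=accuracy, latency=latency_ms,
                storage=storage_mb, energy=energy_mj, labour=labour)

def score(m: dict) -> float:
    """Single scalar trade-off: reward accuracy, penalise every cost term.
    The weights are arbitrary illustrative choices."""
    w = dict(accuracy=1.0, latency=0.002, storage=0.01, energy=0.05, labour=0.5)
    return (w["accuracy"] * m["accuracy"]
            - w["latency"] * m["latency"]
            - w["storage"] * m["storage"]
            - w["energy"] * m["energy"]
            - w["labour"] * m["labour"])

# Grid search over the joint space, with human participation as one axis.
search_space = product([0.3, 0.5, 0.7],    # pruning ratios
                       [8, 16],            # quantisation bit-widths
                       [0.0, 0.05, 0.1])   # human participation levels
best = max((Config(p, b, h) for p, b, h in search_space),
           key=lambda c: score(evaluate(c)))
print("best configuration:", best)
```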
