Adaptive Deep Neural Network Ensemble for Inference-as-a-Service on Edge Computing Platforms

Mohamed Abdel-Mottaleb, Yang Bai, Letian Zhang, Lixing Chen, Jie Xu

doi:10.1109/mass52906.2021.00013

Abstract

The rapid growth of deep learning (DL)-powered mobile applications is creating a soaring demand for computing resources that mobile devices can hardly satisfy. In this paper, we employ Edge Computing to deliver DL inference services to mobile users: Deep Neural Networks (DNNs) are configured on edge servers and process inference tasks received from mobile devices. We propose a novel method, Adaptive DNN Ensemble (ADE), to enhance the performance of DL inference services. The core of ADE is the DNN ensemble technique, which improves the stability and accuracy of DL inference. Because computing resources are limited and service responses are subject to deadlines, ADE must judiciously determine which DNNs to include in the ensemble, which poses a unique DNN ensemble selection problem. In addition, because DNNs exhibit performance variations across tasks with different features, DNN ensemble selection also aims to reconfigure DNN ensembles according to the features of admitted tasks. We design an online learning algorithm, Contextual Combinatorial Multi-Armed Bandit (CC-MAB), to learn the DNN performance for tasks with different features. We rigorously prove that the proposed online learning algorithm achieves asymptotic optimality. Experiments are carried out on an edge computing testbed to evaluate our method, taking into account various implementation concerns, including memory usage, time complexity, and DNN switching cost. The results show that ADE outperforms the benchmark methods in terms of inference accuracy and can provide real-time responses.
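The ensemble selection idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' CC-MAB algorithm, whose details the abstract does not give. The sketch assumes a scalar task feature in [0, 1] that is partitioned into equal bins, a per-model latency table, a sequential latency budget standing in for the response deadline, a logarithmic exploration schedule, and a greedy selection rule. The class name `EnsembleSelector` and all parameters are hypothetical.

```python
import math
from collections import defaultdict


class EnsembleSelector:
    """Toy contextual combinatorial bandit over candidate models (illustrative only)."""

    def __init__(self, latencies, budget, bins=4):
        self.latencies = latencies       # assumed: model name -> latency (ms)
        self.budget = budget             # assumed: total latency allowed per task (ms)
        self.bins = bins                 # assumed: number of context partitions
        self.counts = defaultdict(int)   # (bin, model) -> number of observations
        self.means = defaultdict(float)  # (bin, model) -> mean observed reward

    def _cube(self, context):
        # Map a scalar context in [0, 1] to one of `bins` equal intervals.
        return min(int(context * self.bins), self.bins - 1)

    def select(self, context, t):
        c = self._cube(context)
        threshold = math.log(t + 1)  # assumed exploration schedule

        def priority(m):
            # Under-explored models sort first, then higher estimated reward,
            # then lower latency as a tie-breaker.
            explored = self.counts[(c, m)] > threshold
            return (explored, -self.means[(c, m)], self.latencies[m])

        chosen, used = [], 0.0
        for m in sorted(self.latencies, key=priority):
            # Greedily add a model only if the ensemble stays within budget.
            if used + self.latencies[m] <= self.budget:
                chosen.append(m)
                used += self.latencies[m]
        return chosen

    def update(self, context, model, reward):
        # Incremental mean of the reward (e.g. 1.0 if the model was correct).
        key = (self._cube(context), model)
        self.counts[key] += 1
        self.means[key] += (reward - self.means[key]) / self.counts[key]
```

In use, a caller would invoke `select` for each admitted task, run the returned models, combine their outputs (for example by majority vote), and feed each model's observed correctness back through `update`. Because estimates are kept per context bin, the chosen ensemble can differ between tasks with different features.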

Full Text