Cooperation of Mobile Devices for Fast Inference of Deep Learning Applications

Qinglin Yang, Toshiaki Miyazaki, Xiaofei Luo, Weiqin Tong, Peng Li, Wenfeng Shen

doi:10.1007/s11036-019-01345-0

Abstract

Deep learning has enabled many novel mobile applications, but running these applications efficiently on mobile devices remains challenging. The traditional approach tackles this challenge by offloading computation tasks to the cloud, which suffers from high bandwidth requirements and long transmission latency. In this paper, we propose to enable collaborative inference among mobile devices. Instead of sending deep learning inference tasks to the cloud, we let mobile devices collaboratively share the computation workloads. This is based on an important observation that batching inference tasks on GPUs can accelerate inference processing. To achieve efficient collaboration, we design an algorithm based on particle swarm optimization (PSO), a versatile population-based stochastic optimization technique. We also design a distributed algorithm to address the difficulty of collecting global network information and running the centralized algorithm. Extensive simulations are conducted to evaluate the performance of the designed algorithms. The simulation results show that the collaborative inference scheme can effectively reduce the inference time of mobile deep learning applications.
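For readers unfamiliar with PSO, the following is a minimal sketch of the generic technique, not the algorithm designed in this paper. The cost function `toy_makespan`, the device speeds, and all parameter values (inertia, acceleration coefficients, swarm size) are hypothetical choices made for illustration: three devices share a workload, and the swarm searches for the split that minimizes the completion time of the slowest device.

```python
import random


def pso_minimize(cost, dim, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Generic particle swarm optimization; returns (best_position, best_cost)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position (personal best).
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    # The swarm shares the best position found so far (global best).
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity mixes momentum, pull toward the personal best,
                # and pull toward the global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp the coordinate to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost


def toy_makespan(shares, speeds=(1.0, 2.0, 4.0)):
    """Hypothetical cost: finish time of the slowest device for a workload split."""
    total = sum(shares)
    if total <= 0:
        return float("inf")
    # Normalize shares to fractions of the workload, then divide by device speed.
    return max((s / total) / v for s, v in zip(shares, speeds))


best, best_cost = pso_minimize(toy_makespan, dim=3, bounds=(0.0, 1.0))
print("best split:", best, "makespan:", best_cost)
```

In this toy setting the ideal split is proportional to device speed, which gives a makespan of 1/7; PSO is stochastic and is only expected to approach that value, not to reach it exactly.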

Full Text