In recent years, vehicular networks have seen a proliferation of applications and services such as image tagging, lane detection, and speech recognition. Many of these applications rely on Deep Neural Networks (DNNs) and demand low-latency computation. To meet these requirements, Vehicular Edge Computing (VEC) has been introduced to augment vehicular networks with abundant edge computation capacity that complements the limited computation resources on vehicles. Nevertheless, offloading DNN tasks to MEC (Multi-access Edge Computing) servers effectively and efficiently remains challenging due to the dynamic nature of vehicular mobility and the varying loads on the servers. In this paper, we propose a novel and efficient distributed DNN Partitioning And Offloading (DPAO) scheme that leverages the mobility of vehicles and the synergy between vehicle–edge and edge–edge computing. We exploit the variations in both computation time and output data size across the layers of a DNN to make optimized partitioning decisions that accelerate DNN computation while reducing the transmission time of intermediate data. Meanwhile, we dynamically partition and offload tasks between MEC servers based on their load differences. We have conducted extensive simulations and testbed experiments to demonstrate the effectiveness of DPAO. The evaluation results show that, compared to offloading all tasks to the MEC server, DPAO reduces the latency of DNN tasks by 2.4x. DPAO with queue reservation can further reduce the average task completion time by 10%.
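The per-layer partitioning idea described above can be illustrated with a minimal sketch. The function below is a hypothetical example (not the paper's algorithm): given per-layer compute times on the vehicle and on the MEC server, the output data size of each layer, and the uplink bandwidth, it picks the split point that minimizes end-to-end latency, i.e., on-vehicle compute for the first k layers, transfer of the intermediate data, and edge compute for the rest.

```python
# Hypothetical sketch of layer-wise DNN partition-point selection.
# All numbers and the function itself are illustrative assumptions,
# not taken from the DPAO paper.

def best_split(local_ms, edge_ms, out_mb, bw_mbps):
    """Choose how many leading layers to run on the vehicle.

    local_ms[i] / edge_ms[i]: time (ms) to run layer i on the vehicle / edge.
    out_mb[k]: data (MB) sent if we split before layer k; out_mb[0] is the
               raw input size (offload everything).
    bw_mbps:   uplink bandwidth in MB/s.
    Returns (split_index, total_latency_ms).
    """
    n = len(local_ms)
    best_k, best_t = 0, float("inf")
    for k in range(n + 1):  # k layers on the vehicle, the rest on the edge
        transfer_ms = out_mb[k] / bw_mbps * 1000 if k < n else 0.0
        total = sum(local_ms[:k]) + transfer_ms + sum(edge_ms[k:])
        if total < best_t:
            best_k, best_t = k, total
    return best_k, best_t

# Example: a large raw input (4 MB) makes full offloading expensive, while a
# small intermediate output after layer 1 (0.5 MB) makes an early split cheap.
k, t = best_split(
    local_ms=[10, 40, 60],
    edge_ms=[5, 8, 12],
    out_mb=[4.0, 0.5, 0.3],
    bw_mbps=10,
)
```

With these illustrative numbers, the best choice is to run one layer locally and offload the rest, which is exactly the trade-off the abstract describes: layers whose outputs are much smaller than the raw input are attractive split points.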