An Energy-Efficient Method for Recurrent Neural Network Inference in Edge Cloud Computing

Chao Chen,Guannan Li,Weiyu Guo,Yongkui Yang,Zheng Wang,Zhuoyu Wu

doi:10.3390/sym14122524

Abstract

Recurrent neural networks (RNNs) are widely used to process sequence-related tasks such as natural language processing. Edge cloud computing systems are in an asymmetric structure, where task managers allocate tasks to the asymmetric edge and cloud computing systems based on computation requirements. In such a computing system, cloud servers have no energy limitations, since they have unlimited energy resources. Edge computing systems, however, are resource-constrained, and the energy consumption is thus expensive, which requires an energy-efficient method for RNN job processing. In this paper, we propose a low-overhead, energy-aware runtime manager to process tasks in edge cloud computing. The RNN task latency is defined as the quality of service (QoS) requirement. Based on the QoS requirements, the runtime manager dynamically assigns RNN inference tasks to edge and cloud computing systems and performs energy optimization on edge systems using dynamic voltage and frequency scaling (DVFS) techniques. Experimental results on a real edge cloud system indicate that in edge systems, our method can reduce the energy up to 45% compared with the state-of-the-art approach.

Full Text