Mobile edge computing is a promising solution for enabling real-time services in vehicular networks. However, because the mobile environment is highly dynamic and vehicular services are heterogeneous, traditional expert-based strategies must update handcrafted parameters and learning-based strategies must retrain their models whenever conditions change, which leads to intolerable overhead. Therefore, this paper investigates the problem of multi-task offloading (MTO), in which there exist multiple offloading scenarios with varying parameters, such as task topology, resource requirements, and transmission/computation capability. The objective is to design a unified solution that minimizes task execution time under different MTO scenarios. Accordingly, we develop a Seq2seq-based Meta Reinforcement Learning algorithm for MTO (SMRL-MTO). Specifically, a bidirectional gated recurrent unit (GRU) network integrated with an attention mechanism is designed to determine the offloading action by encoding sequential offloading actions and assigning different weights to different parts of the input sequence. In addition, a meta reinforcement learning framework is designed based on model-agnostic meta-learning, which trains a meta policy offline and adapts quickly to a new MTO scenario within a few training steps. Finally, we conduct a performance evaluation based on the task generator DAGGEN and realistic vehicular traces, which shows that SMRL-MTO reduces task execution time by 11.36% on average compared with a greedy algorithm.
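To make the offloading problem concrete, the following is a minimal sketch of a greedy per-task offloading rule over a task graph. It is an illustration under stated assumptions, not the baseline or the system model used in the paper: the `Task` fields, the function `greedy_offload`, the single local CPU and single edge server, and the contention-free link model are all hypothetical simplifications introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Task:
    """One node of a task DAG (all fields are illustrative assumptions)."""
    cycles: float                 # CPU cycles needed to execute the task
    data_in: float                # bits uploaded if the task is offloaded
    data_out: float               # bits downloaded after edge execution
    preds: List[int] = field(default_factory=list)  # predecessor indices


def greedy_offload(tasks: List[Task], f_local: float, f_edge: float,
                   rate_up: float, rate_down: float) -> Tuple[List[int], float]:
    """Greedily pick, task by task, the option that finishes earliest.

    `tasks` must be listed in topological order. Returns the decisions
    (0 = run locally, 1 = offload to the edge) and the overall finish time.
    Assumed model: one local CPU, one edge server, no link contention.
    """
    finish: List[float] = []      # finish time of each scheduled task
    decisions: List[int] = []
    local_free = 0.0              # time at which the local CPU becomes idle
    edge_free = 0.0               # time at which the edge server becomes idle

    for task in tasks:
        # A task is ready once all of its predecessors have finished.
        ready = max((finish[p] for p in task.preds), default=0.0)

        # Option 1: execute on the vehicle.
        local_done = max(ready, local_free) + task.cycles / f_local

        # Option 2: upload, execute on the edge server, download the result.
        edge_start = max(ready + task.data_in / rate_up, edge_free)
        edge_compute_done = edge_start + task.cycles / f_edge
        edge_done = edge_compute_done + task.data_out / rate_down

        if edge_done < local_done:
            decisions.append(1)
            edge_free = edge_compute_done
            finish.append(edge_done)
        else:
            decisions.append(0)
            local_free = local_done
            finish.append(local_done)

    return decisions, max(finish, default=0.0)


if __name__ == "__main__":
    # Two-task chain: a compute-heavy task followed by a data-heavy one.
    chain = [
        Task(cycles=1e9, data_in=1e6, data_out=1e6),
        Task(cycles=1e7, data_in=8e6, data_out=8e6, preds=[0]),
    ]
    print(greedy_offload(chain, f_local=1e9, f_edge=1e10,
                         rate_up=1e7, rate_down=1e7))
```

A rule of this kind decides each task in isolation and has no mechanism for adapting to a change in topology or channel conditions beyond recomputing the same comparison, which is the kind of limitation a learned, quickly adaptable policy is meant to address.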