Abstract

With the rapid development of space-air-ground computing, large UAVs, airships or high-altitude platform stations (HAPS), and satellites increasingly carry powerful computation resources (e.g., heterogeneous GPUs) and can act as edge servers in the air. They are widely used for deep neural network (DNN) inference applications such as disaster monitoring, remote sensing, and agricultural inspection. However, these airborne edge servers have a very limited energy supply. Reducing their energy consumption to extend their working hours, while still meeting the delay requirements of DNN inference tasks, is therefore an important problem. In this paper, we propose MagicBatch, an energy-aware scheduling framework for DNN inference workloads on edge servers in the air that are equipped with heterogeneous GPUs. MagicBatch builds on our key finding that different GPUs exhibit different energy and latency performance under different DNN inference batch sizes. Accordingly, MagicBatch works in two phases. In the offline analysis phase, it profiles the execution latency and energy consumption of different DNN inference tasks on heterogeneous GPUs. In the online scheduling phase, it uses a heuristic energy-aware scheduling algorithm (PSO-GA) to allocate heterogeneous GPU computing resources to the inference tasks. Evaluation on our emulation testbed shows that MagicBatch achieves more than 31.3% energy savings and a 41.1% throughput improvement compared with state-of-the-art methods.
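
To make the two-phase idea concrete, the following is a minimal sketch and not the paper's PSO-GA algorithm: it assumes the offline phase has produced a table of per-batch latency and energy for each (GPU, batch size) pair, and it swaps in a plain exhaustive search that picks the pair with the lowest energy per sample among those meeting a deadline. The names `Profile` and `pick_config`, the GPU labels, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Profile:
    """One offline measurement (hypothetical structure, not from the paper)."""
    gpu: str
    batch_size: int
    latency_ms: float  # time to execute one batch
    energy_j: float    # energy to execute one batch


def pick_config(profiles: Sequence[Profile], deadline_ms: float) -> Optional[Profile]:
    """Return the deadline-feasible profile with the lowest energy per sample.

    This is a simple exhaustive search used as a stand-in; the paper's
    online phase uses a PSO-GA heuristic instead.
    """
    feasible = [p for p in profiles if p.latency_ms <= deadline_ms]
    if not feasible:
        return None
    return min(feasible, key=lambda p: p.energy_j / p.batch_size)


# Made-up profiling table: illustrative values only, not measured results.
PROFILES = [
    Profile("gpu_a", 1, 8.0, 0.9),
    Profile("gpu_a", 8, 30.0, 3.2),
    Profile("gpu_a", 32, 110.0, 9.6),
    Profile("gpu_b", 1, 5.0, 1.5),
    Profile("gpu_b", 8, 14.0, 4.8),
    Profile("gpu_b", 32, 45.0, 14.4),
]

if __name__ == "__main__":
    for deadline in (20.0, 50.0, 120.0):
        print(deadline, pick_config(PROFILES, deadline))
```

With this made-up table, a tight deadline favors the faster but hungrier `gpu_b` at a medium batch size, while looser deadlines let the more efficient `gpu_a` run larger batches, which mirrors the abstract's observation that the best GPU and batch size depend on the latency budget.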
