Abstract

With the rapid development of wireless communication, achieving the neXt generation Ultra-Reliable and Low-Latency Communications (xURLLC) in 6G mobile communication systems has become a critical problem. Among many applications in xURLLC, deep learning model inference requires improvement over its efficiency. Due to the heterogeneous hardware environment in 6G, parallel schedules from distributed machine learning and edge computing has been borrowed to tackle the efficiency problem. However, traditional parallel schedules suffer from high latency, low throughput, and low device utility. In this paper, we propose Automatic Pipeline Parallelism ( <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">AP</i> <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> ), a parallel inference framework for deep learning applications in 6G mobile communication systems, to improve the model inference efficiency while maintaining reliability. <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">AP</i> <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> contains three sub-modules. A task-device affinity predictor predicts a task’s expected execution time on a given device. The parallel inference arrangement optimizer finds the most suitable device for each task. The parallel inference scheduler converts the arrangement to a schedule that can be directly executed in the system. The experimental results show that <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">AP</i> <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> can achieve better latency, throughput, reliability, and device utility than other parallel schedules. Also, the priority of the sub-module designs has been approved through the experiments.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call