Abstract
Deep neural networks (DNNs) have enabled dramatic advancements in applications such as video analytics, speech recognition, and autonomous navigation. More accurate DNN models typically have higher computational complexity. However, many mobile devices do not have sufficient resources to complete inference tasks using the more accurate DNN models under strict latency requirements. Edge intelligence is a strategy that solves this issue by offloading DNN inference tasks from end devices to more powerful edge servers. Some existing works focus on optimizing inference task allocation and scheduling on edge servers. Other works focus on dynamically adapting inference quality. In this work, we propose combining strategies from both research areas to serve applications that use deep neural networks to perform inference on offloaded video frames. The goals of the system are to maximize the accuracy of inference results and the number of requests the edge cluster can serve while meeting latency requirements. We propose heuristic algorithms that jointly adapt model quality and route inference requests, leveraging techniques that include model selection, dynamic batching, and frame resizing. We evaluated the proposed system in testbed experiments, comparing it to a baseline deployment with no quality adaptation. Our system provided 9.7% higher accuracy at low loads and met deadlines for 43.5% more frames at high loads. We also evaluated the proposed system in simulation experiments, where the system processed 16% more accurate frames per second than a solution with only quality adaptation and 45% more than one with only intelligent request routing.
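The sketch below is only a rough illustration of the joint quality-adaptation and request-routing idea summarized in the abstract. The model names, profiled accuracy and latency figures, the `pick_variant_and_server` helper, and the per-server queue estimates are all hypothetical assumptions introduced here for illustration, not the paper's actual algorithm or measurements; dynamic batching is omitted for brevity.

```python
# Hypothetical greedy heuristic for joint model-quality adaptation and request
# routing. All profile numbers and names below are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class ModelVariant:
    name: str
    accuracy: float               # profiled accuracy on a validation set
    latency_per_frame_ms: float   # profiled inference latency at batch size 1
    input_size: int               # square resolution frames are resized to


# Assumed profile table, ordered from most to least accurate.
VARIANTS = [
    ModelVariant("resnet152", accuracy=0.78, latency_per_frame_ms=40.0, input_size=224),
    ModelVariant("resnet50",  accuracy=0.76, latency_per_frame_ms=20.0, input_size=224),
    ModelVariant("resnet18",  accuracy=0.70, latency_per_frame_ms=8.0,  input_size=160),
]


def pick_variant_and_server(deadline_ms: float, queue_delay_ms: dict[str, float]):
    """Route the frame to the least-loaded edge server, then choose the most
    accurate model variant whose queueing delay plus inference latency still
    meets the frame's deadline; degrade to the cheapest variant if none fit."""
    server = min(queue_delay_ms, key=queue_delay_ms.get)
    slack = deadline_ms - queue_delay_ms[server]
    for variant in VARIANTS:
        if variant.latency_per_frame_ms <= slack:
            return server, variant
    return server, VARIANTS[-1]   # prefer lower quality over a missed deadline


# Example: with a 50 ms deadline and these queue estimates, the heuristic picks
# the lightly loaded server and steps down from resnet152 to resnet50.
server, variant = pick_variant_and_server(50.0, {"edge-1": 35.0, "edge-2": 15.0})
print(server, variant.name)
```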