Abstract

We propose a configurable model deployment architecture (CMDA) for edge AI as a service (AIaaS) that lets the manager jointly configure the data quality ratio (DQR) and model complexity ratio (MCR) of each AI task. Combined with common resource allocation operations, this joint configuration lets the manager reduce the energy and delay of AI services while meeting the desired quality of results (QoR). We formulate an energy-delay minimization problem under the CMDA framework and propose a polynomial-regression-based relaxation method to solve the task configuration subproblem. We run experiments and simulations on ImageNet classification and Common Objects in Context (COCO) object detection using state-of-the-art deep learning models, and present the corresponding result quality tables (RQTs) and QoR regression models to illustrate the method. Results for single-task configuration and for multi-task configuration with resource allocation show that the proposed method achieves over 5× HDEC improvement compared with non-optimized schemes, and that jointly configuring DQR and MCR achieves over 1.2× HDEC improvement compared with methods that configure only DQR or only MCR.
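
One possible reading of the regression-based relaxation can be sketched as follows: fit a low-degree polynomial to a discrete RQT so that QoR becomes a smooth function of (DQR, MCR), search the continuous domain for a cheap feasible point, then round up to the nearest supported levels. Everything concrete here is an assumption made for illustration and is not taken from the paper: the RQT values are synthetic, the level set `LEVELS`, the cost proxy `cost`, and the helper names are hypothetical, and a plain grid search stands in for whatever solver the authors use.

```python
import numpy as np

# Hypothetical discrete levels supported for both DQR and MCR (assumption).
LEVELS = np.array([0.25, 0.5, 0.75, 1.0])

def features(d, m):
    """Degree-2 polynomial features in (DQR, MCR)."""
    d, m = np.asarray(d, dtype=float), np.asarray(m, dtype=float)
    return np.stack([np.ones_like(d), d, m, d**2, d * m, m**2], axis=-1)

def synthetic_qor(d, m):
    """Synthetic stand-in for measured QoR; NOT data from the paper."""
    return 0.9 - 0.3 * (1 - d) ** 2 - 0.2 * (1 - m) ** 2

# Build a small synthetic RQT: one QoR value per (DQR, MCR) level pair.
D, M = np.meshgrid(LEVELS, LEVELS, indexing="ij")
rqt = synthetic_qor(D, M)

# Least-squares fit of the polynomial QoR model to the table.
coef, *_ = np.linalg.lstsq(features(D.ravel(), M.ravel()), rqt.ravel(), rcond=None)

def predict_qor(d, m):
    """QoR predicted by the fitted polynomial at a continuous (DQR, MCR)."""
    return features(d, m) @ coef

def cost(d, m):
    """Hypothetical energy-delay proxy: grows with data size and model size."""
    return d * m

def configure(target, steps=76):
    """Cheapest relaxed (DQR, MCR) meeting the QoR target, rounded up to LEVELS."""
    grid = np.linspace(LEVELS[0], LEVELS[-1], steps)
    d, m = np.meshgrid(grid, grid, indexing="ij")
    feasible = predict_qor(d, m) >= target
    if not feasible.any():
        return None  # target unreachable under the fitted model
    masked = np.where(feasible, cost(d, m), np.inf)
    i, j = np.unravel_index(np.argmin(masked), masked.shape)
    # Round each relaxed value up to the nearest supported level.
    d_lvl = LEVELS[np.searchsorted(LEVELS, d[i, j] - 1e-9)]
    m_lvl = LEVELS[np.searchsorted(LEVELS, m[i, j] - 1e-9)]
    return float(d_lvl), float(m_lvl)

if __name__ == "__main__":
    print(configure(0.8))
```

Rounding up preserves feasibility here only because the synthetic QoR is non-decreasing in both ratios; with a real RQT that property would have to be checked, and the rounded point re-evaluated against the table.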
