Abstract
Edge computing is a recent computing architecture that utilizes computing resources near the data source or service endpoint to provide services with better Quality of Service, and deploying Deep Neural Networks (DNNs) on the edge has recently been gaining attention in the Artificial Intelligence (AI) service area. An AI application placed on an edge device typically needs to be optimized before deployment, since edge devices have limited computing resources compared to the cloud. However, this optimization usually degrades the accuracy of the AI model, because the precision of the model parameters is reduced during the optimization phase. In this paper, a novel method for DNN deployment is proposed. Instead of optimizing the parameters of an AI model, the proposed method divides the model into two or more parts and places them on the edge and the cloud. Only a partial AI model is placed on the edge, and since no parameter optimization is performed, there is no loss of accuracy. To make the deployed AI easy to maintain and access, a container structure that turns an AI model into a microservice is also proposed. A test was conducted with two AI microservices, each containing a partial AI model, and the results show an increase in end-to-end service delay but no loss of accuracy.
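As an illustration of the partitioning idea summarized above, the following is a minimal sketch, not the paper's implementation, of splitting a DNN into an edge part and a cloud part without modifying any parameters; the model architecture, split point, and tensor shapes are all hypothetical.

```python
import torch
import torch.nn as nn

# Hypothetical DNN standing in for the full AI model.
full_model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # early layers -> edge
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(16 * 32 * 32, 128),                # later layers -> cloud
    nn.ReLU(),
    nn.Linear(128, 10),
)

split_index = 3  # assumed split point between the edge part and the cloud part

# Partition the model without touching its parameters, so accuracy is unchanged.
edge_part = nn.Sequential(*list(full_model.children())[:split_index])
cloud_part = nn.Sequential(*list(full_model.children())[split_index:])

# The edge microservice runs the first part; the intermediate tensor is then
# sent to the cloud microservice, which completes the inference.
x = torch.randn(1, 3, 32, 32)       # example input at the service endpoint
intermediate = edge_part(x)         # computed on the edge device
output = cloud_part(intermediate)   # completed on the cloud
print(output.shape)                 # torch.Size([1, 10])
```

In such a split, the extra network hop for the intermediate tensor is what contributes to the increased end-to-end service delay reported in the results, while accuracy is preserved because the parameters are left untouched.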