Abstract
Artificial intelligence models deployed on power-efficient Internet-of-Things (IoT) devices suffer accuracy degradation because of the devices' limited power budgets. To mitigate this accuracy loss, an edge-server joint inference system is introduced. In such a system, allocating more workloads to the server side can reduce the accuracy loss, but the required data transmission adds to the power consumption of the edge device. Thus, in this article, we present a novel two-stage method for allocating workloads to the server or the edge that maximizes inference accuracy under a power constraint. In the first stage, we present a clusterwise threshold-based method for estimating the trustworthiness of a prediction made at the edge. In the second stage, we further determine the workload allocation of a trustworthy image based on the probability of the top-1 prediction and the power constraint. In addition, we propose a fine-tuning process for the pretrained model at the edge to achieve better accuracy. In the experiments, we apply the proposed method to several well-known deep neural network models. The results show that the proposed method can improve inference accuracy by up to 3.93% under a specific power constraint compared with previous methods.
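To illustrate the two-stage allocation idea described in the abstract, the sketch below shows how a top-1 softmax probability check could route a sample to the edge or the server under a transmission power budget. This is not the authors' implementation; the per-cluster thresholds (`cluster_thresholds`), the second-stage cutoff (`offload_threshold`), and the per-image transmission cost (`tx_cost`) are hypothetical names introduced here for illustration only.

```python
import numpy as np

def route_sample(edge_probs, cluster_id, cluster_thresholds,
                 offload_threshold, remaining_budget, tx_cost):
    """Decide whether to keep a prediction at the edge or offload it.

    edge_probs         -- softmax output of the edge model for one image
    cluster_id         -- cluster the image is assigned to (hypothetical)
    cluster_thresholds -- per-cluster trustworthiness thresholds (stage 1)
    offload_threshold  -- top-1 probability cutoff for stage 2
    remaining_budget   -- transmission power still available
    tx_cost            -- assumed power cost of sending one image to the server
    """
    top1_prob = float(np.max(edge_probs))

    # Stage 1: a prediction below its cluster threshold is treated as
    # untrustworthy and offloaded if the power budget still allows it.
    if top1_prob < cluster_thresholds[cluster_id]:
        if remaining_budget >= tx_cost:
            return "server", remaining_budget - tx_cost
        return "edge", remaining_budget  # budget exhausted, keep edge result

    # Stage 2: even a trustworthy prediction with a modest top-1 probability
    # may still be offloaded while the remaining budget permits.
    if top1_prob < offload_threshold and remaining_budget >= tx_cost:
        return "server", remaining_budget - tx_cost

    return "edge", remaining_budget
```

In this sketch, the power constraint is modeled simply as a running budget decremented by each transmission; the actual constraint formulation and threshold selection follow the method described in the paper.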