Abstract

Large Machine Learning (ML) models require considerable computing resources, which makes integrating them with the decentralized operation of heterogeneous and resource-constrained Internet of Things (IoT) devices challenging. Running ML tasks in the cloud introduces network delay, throughput limitations, and privacy concerns, whereas running them on IoT devices is penalized by the devices' constrained resources. For this reason, recent research has proposed cooperative execution of ML tasks over IoT networks, but it does not simultaneously account for resource variability and the energy constraints of IoT devices. In this paper, we propose Early Exit of Computation (EEoC), an adaptive, energy-efficient, low-latency inference scheme over IoT networks. EEoC adaptively distributes the inference computation load between the IoT device and the edge server, based on estimated communication and computation resources, to jointly minimize prediction latency and energy consumption. We evaluate our solution's latency and energy profile on a real testbed running two widely used neural networks. Results show that EEoC reduces latency and energy consumption by up to 24.6% and 46.5%, respectively, compared to other state-of-the-art solutions, without sacrificing accuracy.
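
The paper's full text is not available here, so the following is a minimal illustrative sketch only, not the authors' EEoC algorithm. It assumes a layer-wise partitioning view of the decision the abstract describes: given per-layer device and edge compute-time estimates, intermediate activation sizes, and the measured uplink bandwidth, pick the split point that minimizes a weighted combination of end-to-end latency and device-side energy. All names (select_partition, device_time, edge_time, act_bytes, uplink_bps, p_compute, p_transmit, alpha) are hypothetical.

    # Hypothetical sketch of a partition-point decision in the spirit of EEoC.
    # Layers [0, k) run on the IoT device; the activation at split point k is
    # uploaded to the edge server, which runs the remaining layers.
    # k == 0 means full offload; k == n means fully on-device inference.

    def select_partition(device_time, edge_time, act_bytes,
                         uplink_bps, p_compute, p_transmit, alpha=0.5):
        """Return the split index k minimizing a weighted latency/energy cost.

        device_time, edge_time: per-layer execution times (seconds), length n.
        act_bytes: data sent at each split point (bytes), length n + 1,
                   with act_bytes[0] the input size and act_bytes[n] == 0.
        p_compute, p_transmit: device power draw (watts) while computing
                               and while transmitting, respectively.
        alpha: weight trading off latency against device-side energy.
        """
        n = len(device_time)
        assert len(edge_time) == n and len(act_bytes) == n + 1
        best_k, best_cost = 0, float("inf")
        for k in range(n + 1):
            t_local = sum(device_time[:k])        # on-device compute time
            t_tx = act_bytes[k] / uplink_bps      # activation upload time
            t_edge = sum(edge_time[k:])           # remaining edge compute time
            latency = t_local + t_tx + t_edge
            energy = p_compute * t_local + p_transmit * t_tx  # device side only
            cost = alpha * latency + (1 - alpha) * energy
            if cost < best_cost:
                best_k, best_cost = k, cost
        return best_k

    # Example with made-up profiling numbers for a 4-layer network:
    k = select_partition(
        device_time=[0.02, 0.05, 0.08, 0.03],     # seconds per layer on device
        edge_time=[0.004, 0.010, 0.015, 0.006],   # seconds per layer on edge
        act_bytes=[150e3, 400e3, 90e3, 20e3, 0],  # bytes sent at each split
        uplink_bps=2e6, p_compute=0.9, p_transmit=1.3)

In an adaptive scheme such as the one the abstract describes, the bandwidth and compute-time estimates would be refreshed at run time and the split point recomputed as conditions change; this sketch shows only a single static decision.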
