Abstract

A cost-effective implementation of Convolutional Neural Networks on the mobile edge of the Internet of Things (IoT) requires smart optimizations to fit large models into memory-constrained cores. Reduction methods that jointly combine filter pruning and weight quantization have proven efficient at searching for the compression that ensures minimum model size without accuracy loss. However, other optimal configurations exist that stem from the memory constraint itself. The objective of this work is to assess such memory-bounded implementations and to show that most of them are centred on specific parameter settings that are difficult to implement on a low-power RISC core. Hence, the focus is on quantifying the distance to optimality of the closest implementations that can actually be deployed on hardware. The analysis is powered by a two-stage framework that efficiently explores the memory-accuracy space using a lightweight, hardware-conscious heuristic optimization. Results are collected from three realistic IoT tasks (Image Classification on CIFAR-10, Keyword Spotting on the Speech Commands Dataset, Facial Expression Recognition on Fer2013) run on RISC cores (ARM Cortex-M) with a few hundred KB of on-chip RAM.
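
As a rough illustration of the memory constraint at play, the sketch below estimates the weight footprint of a small ConvNet under a given filter-pruning ratio and quantization bit-width, and checks it against an on-chip RAM budget. It is a minimal sketch: the layer shapes, function names, and the uniform per-layer pruning model are illustrative assumptions, not the paper's actual framework.

```python
# Minimal sketch (hypothetical, not the paper's framework): estimate the
# weight-memory footprint of a ConvNet after filter pruning and weight
# quantization, and test it against an on-chip RAM budget.

def model_size_bytes(layers, prune_ratio, bits):
    """Approximate weight footprint in bytes.

    layers      : list of (c_in, c_out, kernel) tuples for the conv layers
    prune_ratio : fraction of output filters removed per layer, in [0, 1)
    bits        : weight bit-width after quantization (e.g. 8, 4, 2)
    """
    total_bits = 0
    prev_kept = None  # pruning a layer's filters also shrinks the next layer's input channels
    for c_in, c_out, kernel in layers:
        c_in_eff = prev_kept if prev_kept is not None else c_in
        kept_out = max(1, round(c_out * (1.0 - prune_ratio)))
        total_bits += c_in_eff * kept_out * kernel * kernel * bits
        prev_kept = kept_out
    return total_bits // 8

# Hypothetical CIFAR-10-style stack checked against a 512 KB RAM budget.
layers = [(3, 32, 3), (32, 64, 3), (64, 128, 3), (128, 256, 3)]
budget = 512 * 1024
for bits in (8, 4, 2):
    for prune in (0.0, 0.25, 0.5):
        size = model_size_bytes(layers, prune, bits)
        status = "fits" if size <= budget else "exceeds budget"
        print(f"bits={bits} prune={prune:.2f} -> {size / 1024:6.1f} KB ({status})")
```

In these terms, the memory-bounded configurations mentioned above correspond to (prune_ratio, bits) pairs whose footprint sits just under the budget; per the abstract, the settings the hardware can actually execute may lie at some distance from those optima.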

Highlights

  • Most IoT applications run Deep Convolutional Neural Networks (ConvNets hereafter) in the cloud, public or private depending on the context

  • The analysis aims to assess the optimality of hardware-compliant implementations and quantify their distance from theoretical solutions

  • We introduce the ConvNets adopted as test-cases, together with the datasets used for the training stage and the evaluation

Introduction

Most IoT applications run Deep Convolutional Neural Networks (ConvNets hereafter) in the cloud, public or private depending on the context. The focus of this work is on low-cost IoT applications (e.g. that described in [2]) where form factor and energy budget are the main concerns. In such cases, the software stack is developed over off-the-shelf embedded platforms powered by tiny RISC cores. In their early years, ConvNets were mainly optimized to improve accuracy; this led to an exponential increase in size and complexity. The rise of edge computing brought memory and storage capacity into the loop. During this fast evolution, several optimization methods have been introduced and tested on different architectures; a thorough overview is reported in [3]. This section gives a critical review of prior art, motivating the choices implemented in this work.
