Abstract

Wearable, embedded, and IoT devices are a centrepiece of many ubiquitous computing applications, such as fitness tracking, health monitoring, home security, and voice assistants. By gathering user data through a variety of sensors and leveraging machine learning (ML), applications can adapt their behaviour: in other words, devices become "smart". Such devices are typically powered by microcontroller units (MCUs). As MCUs continue to improve, smart devices become capable of performing a non-trivial amount of sensing and data processing, including machine learning inference, which affords users greater data privacy and autonomy than offloading the execution of ML models to another device. Advanced predictive capabilities across many tasks make neural networks an attractive ML model for ubiquitous computing applications; however, on-device inference on MCUs remains extremely challenging. MCUs offer orders of magnitude less storage, memory, and computational ability than is typically required to execute neural networks, which imposes strict structural constraints on the network architecture and calls for specialist model compression methodology. In this work, we present a differentiable structured pruning method for convolutional neural networks that integrates a model's MCU-specific resource usage and parameter importance feedback to obtain highly compressed yet accurate models. Compared to related network pruning work, our compressed models are more accurate thanks to better use of the MCU resource budget; compared to MCU-specialist work, they are produced faster. The user only needs to specify the amount of available computational resources, and the pruning algorithm automatically compresses the network during training to satisfy that budget. We evaluate our methodology on benchmark image and audio classification tasks and find that it (a) improves key resource usage of neural networks by up to 80x; (b) incurs little to no overhead in, or even improves, model training time; and (c) produces compressed models with matching or improved resource usage (by up to 1.4x) in less time than prior MCU-specific model compression methods.
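
To make the idea concrete, the sketch below shows one common way to realise differentiable structured pruning with a resource budget in PyTorch: learnable per-channel gates act as an importance signal, and a penalty on the expected fraction of retained channels stands in for an MCU-specific resource model. The GatedConv module, resource_penalty helper, budget value, and penalty weight are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedConv(nn.Module):
    """Convolution with a learnable per-output-channel gate used for structured pruning."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)
        # One learnable logit per output channel; sigmoid(logit) acts as a soft keep-probability.
        self.gate_logits = nn.Parameter(torch.zeros(out_ch))

    def gates(self):
        return torch.sigmoid(self.gate_logits)

    def forward(self, x):
        return self.conv(x) * self.gates().view(1, -1, 1, 1)

def resource_penalty(gated_layers, budget_fraction=0.25):
    """Hinge penalty on the expected fraction of channels kept; a stand-in for a
    device resource model (memory / MACs), not the paper's exact objective."""
    kept = torch.cat([layer.gates() for layer in gated_layers])
    return F.relu(kept.mean() - budget_fraction)

# Tiny model and a single training step on random data (illustration only).
model = nn.Sequential(
    GatedConv(3, 16), nn.ReLU(),
    GatedConv(16, 32), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))

opt.zero_grad()
gated = [m for m in model if isinstance(m, GatedConv)]
loss = F.cross_entropy(model(x), y) + 1.0 * resource_penalty(gated)
loss.backward()
opt.step()

# After training, channels whose gates fall below a threshold would be removed,
# yielding a smaller dense network for MCU deployment.
```

In this kind of scheme, the gate values play the role of parameter importance feedback, while the penalty term ties the training objective to the available resource budget, so pruning happens jointly with training rather than as a separate post-processing step.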
