Abstract

With the introduction of edge analytics, IoT devices are becoming smart and ready for AI applications. A few modern ML frameworks focus on generating small ML models (often a few kB) that can be flashed directly onto and executed on tiny IoT devices, particularly embedded systems. Edge analytics eliminates expensive device-to-cloud communication, producing intelligent devices that can perform energy-efficient, real-time, offline analytics. However, any increase in training data results in a linear increase in the size and space complexity of the trained ML models, making them impossible to deploy on IoT devices with limited memory. To alleviate this memory issue, a few studies have focused on optimizing and fine-tuning existing ML algorithms to reduce their complexity and size. Such optimization, however, usually depends on the nature of the IoT data being trained on. In this paper, we present an approach that preserves model quality without requiring any alteration to existing ML algorithms. We propose an SRAM-optimized implementation and efficient deployment of standard, stable classifiers from widely used ML frameworks (e.g., Python scikit-learn). Our initial evaluation shows that our approach is highly resource-friendly: it has a very limited memory footprint when executing large and complex ML models on MCU-based IoT devices and can perform ultra-fast classification while consuming 0 bytes of SRAM. When tested on a variety of MCU-based devices, the majority of ported models produced inference results 1-4x faster than models ported by the sklearn-porter, m2cgen, and emlearn libraries.

Highlights

  • The vast majority of edge devices use simple supervised ML classifiers such as Decision Trees (DTs) and Random Forests (RFs) to solve ranking, regression, and classification problems locally at the device level

  • Multiple studies [1,2] have shown that tree-based algorithms can be implemented on embedded sensor systems or tiny IoT devices only after refinement and fine-tuning to fit comfortably within the specific hardware architecture

Summary

INTRODUCTION

The vast majority of edge devices use simple supervised ML classifiers such as Decision Trees (DTs) and Random Forests (RFs) to solve ranking, regression, and classification problems locally at the device level. When users try to execute complex tree-based models on edge devices such as smart doorbells, HVAC controllers, and smart energy meters, these high-quality models with a large number of tree nodes often cannot fit within the memory of MCUs, resulting in memory-overflow issues. To keep a low memory footprint, users sometimes design sparse and shallow tree learners that require only a few kB of memory [6]. Such shallow-tree learning, or aggressive pruning to fit within a few kB, often degrades accuracy because non-linear and complex decision boundaries are approximated with a small number of axis-aligned hyperplanes. In contrast to existing compression techniques, our approach reduces the size of ML models without any alteration, so standard classifiers trained on any dataset can be efficiently deployed and executed on MCUs. Even autonomous tiny IoT devices can then efficiently control real-world IoT applications by making timely predictions and decisions. Despite the reduced memory footprint, our approach guarantees the same level of performance (accuracy, F1 score, etc.) as the original models (before porting) evaluated in high-resource lab setups

PROPOSED DESIGN
SRAM-optimized Porting of Trained Classifiers to C
EXPERIMENTAL EVALUATION
DISCUSSION AND FUTURE WORK

