Abstract

The majority of Internet of Things (IoT) devices are tiny embedded systems with a micro-controller unit (MCU) as their brain. The memory (SRAM, Flash, and EEPROM) of such MCU-based devices is often very limited, which restricts onboard Machine Learning (ML) model training on large training sets with high feature dimensions. To cope with these memory constraints, current edge analytics approaches train high-quality ML models on cloud GPUs using large volumes of historical data, then deploy deeply optimized versions of the resulting models on edge devices for inference. Such approaches are inefficient under concept drift, where the data generated at the device level varies frequently and trained models have no way to adapt when previously unseen data arrives. In this paper, we present Train++, an incremental training algorithm that trains ML models locally at the device level (e.g., on MCUs and small CPUs) using the full n-samples of high-dimensional data. Train++ transforms even the most resource-constrained MCU-based IoT edge devices into intelligent devices that can locally build their own knowledge base on-the-fly from live data, thus creating smart, self-learning, and autonomous problem-solving devices. The Train++ algorithm is extensively evaluated on 5 popular MCU boards, using 7 datasets of varying sizes and feature dimensions. Key findings from the evaluation are: (i) the proposed method reduces onboard binary classifier training time by ≈ 10–226 s across various commodity MCUs; (ii) Train++ can perform inference on MCUs for the entire test set in real time (1 ms); (iii) accuracy improves by 5.15–7.3%, since the incremental characteristic of Train++ enables loading the full n-samples of the high-dimensional datasets even on MCUs with only a few hundred kB of memory.
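
To illustrate why incremental training suits memory-limited devices, the following is a minimal sketch of a generic online binary classifier that learns one sample at a time. It is not the Train++ algorithm, whose update rule is not given in this abstract; the perceptron-style update and the names `OnlineLinearClassifier`, `partial_fit`, and `train_on_stream` are assumptions made purely for illustration. The point it shows is that the model keeps only a weight vector in memory, so the training set never has to be held on the device all at once.

```python
from typing import Iterable, List, Tuple


class OnlineLinearClassifier:
    """Hypothetical perceptron-style learner; NOT the Train++ update rule."""

    def __init__(self, n_features: int, learning_rate: float = 0.1) -> None:
        # Memory use is fixed: one weight per feature plus a bias.
        self.weights: List[float] = [0.0] * n_features
        self.bias: float = 0.0
        self.learning_rate = learning_rate

    def score(self, x: List[float]) -> float:
        return sum(w * xi for w, xi in zip(self.weights, x)) + self.bias

    def predict(self, x: List[float]) -> int:
        return 1 if self.score(x) >= 0.0 else -1

    def partial_fit(self, x: List[float], y: int) -> bool:
        """Learn from one sample (y in {-1, +1}); return True if the model changed."""
        if y * self.score(x) > 0.0:
            return False  # already classified correctly, nothing to update
        for i, xi in enumerate(x):
            self.weights[i] += self.learning_rate * y * xi
        self.bias += self.learning_rate * y
        return True


def train_on_stream(model: OnlineLinearClassifier,
                    stream: Iterable[Tuple[List[float], int]]) -> int:
    """Feed samples one by one, as they arrive; return the number of updates."""
    updates = 0
    for x, y in stream:
        if model.partial_fit(x, y):
            updates += 1
    return updates
```

A device could call `partial_fit` on each new sensor reading as it arrives, which is also one simple way to keep adapting when the data distribution drifts.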
