Abstract

Deep Neural Networks (DNNs) have served as the fundamental building blocks of a broad spectrum of machine learning applications due to their superior performance. However, deploying deep learning techniques on the rapidly growing population of edge devices, such as wearable Internet of Things (IoT) devices, smartphones, and smart health devices with embedded sensors, is difficult because of their limited computation and memory resources. Efficient system designs and algorithms are therefore needed for the wide deployment of deep learning inference on edge devices. To address this problem, we propose a set of hardware-friendly structured model pruning and compiler optimization techniques to accelerate DNN execution on edge devices. Structured model pruning shrinks the model to satisfy the limited computation and memory resource constraints and enables potential hardware acceleration, while compiler optimization further accelerates DNN inference on edge devices. With the proposed techniques, we achieve real-time DNN inference on edge devices, as shown in the demo with various DNN applications deployed on mobile devices. These techniques enable us to explore impactful deep learning solutions on affordable wearable IoT devices that support users' well-being. In particular, better personalization of health-related solutions can improve care and enhance the user experience through the superior performance of deep learning on smart health devices.
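As a rough illustration of the structured pruning component, the sketch below zeroes out entire convolution filters by magnitude using PyTorch's built-in structured pruning utility. The model, layer shapes, and 50% pruning ratio are illustrative assumptions, not details taken from the paper; the paper's actual pruning criteria and ratios may differ.

# A minimal sketch of magnitude-based structured (filter) pruning in PyTorch.
# The toy model and the 50% pruning ratio are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
)

# Remove 50% of each conv layer's output filters (dim=0), ranked by L2 norm.
# Zeroing whole filters keeps the weight tensor dense and regular, which is
# what makes structured pruning amenable to hardware and compiler acceleration.
for module in model.modules():
    if isinstance(module, nn.Conv2d):
        prune.ln_structured(module, name="weight", amount=0.5, n=2, dim=0)
        prune.remove(module, "weight")  # make the pruning permanent

# Report how many filters survive in each conv layer.
for name, module in model.named_modules():
    if isinstance(module, nn.Conv2d):
        kept = (module.weight.detach().abs().sum(dim=(1, 2, 3)) > 0).sum().item()
        print(f"layer {name}: {kept}/{module.weight.shape[0]} filters kept")

In practice, the zeroed filters would then be physically removed (shrinking the next layer's input channels accordingly) so that a compiler can generate smaller, faster kernels for the pruned model.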
