Acceleration of neural network model execution on embedded systems

Chang-Jiun Chen, May-Chen Martin-Kuo, Kai-Chun Chen

doi:10.1109/vlsi-dat.2018.8373246

Abstract

Deep learning has made various breakthroughs in cognitive tasks. However, deep neural networks are, by nature, both computation-intensive and memory-intensive. Most existing deep learning applications are therefore cloud-based, which raises concerns about latency, privacy violation, and bandwidth consumption. With the exponential growth of IoT (Internet of Things) devices, deploying deep learning applications to edge devices is, or soon will be, required. It is therefore critical to tune neural network models to accelerate the execution of deep learning applications on embedded systems. Even with the breakthroughs in FPGAs and ASICs tailored for deep learning, adjusting the neural network models has proven necessary for efficient execution. Many studies have been presented in the past few years; unfortunately, many of the approaches are not good fits for industrial production. For example, an accuracy drop from 99.9% to 99.0% is a loss of less than one percentage point, but from the industrial point of view it means the error rate grows tenfold (from 0.1% to 1.0%), which might prevent an application from becoming a mature product for the market. This paper presents recent advancements in neural network models. The compression techniques are grouped into four categories based on their principles. The first category focuses on reducing the model size; its approaches are mostly known as pruning, quantization, and compression. The second category focuses on speeding up matrix multiplications, through matrix factorization and filtering. The third category repurposes models based on domain knowledge; representative works include knowledge distillation and transfer learning. The last category consists of hybrid approaches. We conclude with a discussion of the foreseeable challenges in industrial applications.
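
As a rough illustration of two points in the abstract, the sketch below works through the error-rate arithmetic of the accuracy example and shows toy versions of two first-category techniques, magnitude pruning and symmetric 8-bit quantization. The function names, the 50% sparsity level, and the weight values are assumptions made for this example; they are not taken from the paper and do not represent any specific method it surveys.

```python
# Illustrative sketch only: names, sparsity level, and weights are assumptions.

def error_rate_growth(acc_before, acc_after):
    """Return the factor by which the error rate grows when accuracy drops."""
    return (1.0 - acc_after) / (1.0 - acc_before)

def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitudes."""
    k = int(len(weights) * sparsity)  # number of weights to remove
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])  # indices of the k smallest-magnitude weights
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

def quantize_int8(weights):
    """Symmetric uniform quantization to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map quantized integers back to approximate float weights."""
    return [v * scale for v in q]

# Toy weight vector (hypothetical values).
weights = [0.02, -0.75, 0.31, -0.04, 0.9, 0.11, -0.5, 0.007]

pruned = magnitude_prune(weights, sparsity=0.5)
q, scale = quantize_int8(pruned)
restored = dequantize(q, scale)

# Accuracy 99.9% -> 99.0% corresponds to error rate 0.1% -> 1.0%.
growth = error_rate_growth(0.999, 0.990)
print(f"error rate grows about {growth:.1f}x")
print("pruned:  ", pruned)
print("int8:    ", q)
print("restored:", [round(r, 3) for r in restored])
```

In this toy setting, each restored weight differs from its pruned value by at most half a quantization step (scale / 2); whether such a loss is acceptable depends on the application's accuracy target.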


Similar Papers


The Application of Deep Learning in Image Processing is Studied Based on the Reel Neural Network Model
Nan Xu
Journal of Physics: Conference Series | VOL. 1881
01 Apr 2021

Edge-MultiAI: Multi-Tenancy of Latency-Sensitive Deep Learning Applications on Edge
Sm Zobaed ... Mathieu Kourouma
01 Dec 2022

Applications of IOT using Deep Learning
Dr Anitha T N ... Dr Jayasudha K
15 Dec 2022

A deep transfer learning approach for IoT/IIoT cyber attack detection using telemetry data
S Poonkuzhal ... M Shobana
Neural Network World | VOL. 33
01 Jan 2023
