Architectural Analysis of Deep Learning on Edge Accelerators

Luke Kljucaric, Alex Johnson, Alan D George

doi:10.1109/hpec43674.2020.9286209

Abstract

As computer architectures continue to integrate application-specific hardware, it is critical to understand the relative performance of devices for maximum application acceleration. The goal of benchmarking suites, such as MLPerf for analyzing machine-learning (ML) hardware performance, is to standardize a fair comparison of different hardware architectures. However, many applications are not well represented by these standards and require different workloads, such as ML models and datasets, to achieve similar goals. Additionally, many devices feature hardware optimized for data types other than 32-bit floating-point numbers, the standard representation defined by MLPerf. Edge-computing devices often feature application-specific hardware to offload common operations found in ML applications from the constrained CPU. This research analyzes multiple low-power compute architectures that feature ML-specific hardware on a case study of handwritten Chinese character recognition. Specifically, AlexNet and a custom version of GoogLeNet are benchmarked in terms of their streaming latency for optical character recognition. Because these models are custom and not the most widely used, many architectures are not specifically optimized for them. Their performance can therefore stress devices in different, yet insightful, ways from which generalizations about the performance of other models can be drawn. The NVIDIA Jetson AGX Xavier (AGX), Intel Neural Compute Stick 2 (NCS2), and Google Edge TPU architectures are analyzed with respect to their performance. The AGX and TPU designs showed the lowest streaming latency for AlexNet and GoogLeNet, respectively. Additionally, the tightly integrated NCS2 design showed the best generalizability in performance and efficiency across neural networks.
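
As an illustration of the metric named in the abstract, the following is a minimal sketch of how streaming latency might be measured, assuming it means the time to process one input at a time (batch size 1). The abstract does not describe the authors' measurement harness, so the function names `streaming_latencies` and `summarize`, the `infer` callable, and the `warmup` parameter are all hypothetical and stand in for whatever device-specific runtime is actually used.

```python
import statistics
import time
from typing import Callable, Dict, Iterable, List


def streaming_latencies(infer: Callable[[object], object],
                        samples: Iterable[object],
                        warmup: int = 3) -> List[float]:
    """Time `infer` on one sample at a time and return latencies in ms.

    `infer` is a hypothetical stand-in for a device-specific inference
    call; it is assumed to block until the result is available.
    """
    samples = list(samples)
    # Warm-up calls are not timed, so one-time setup costs
    # (model loading, cache fills) do not skew the measurements.
    for sample in samples[:warmup]:
        infer(sample)

    latencies_ms = []
    for sample in samples:
        start = time.perf_counter()
        infer(sample)  # batch size 1: a single input per call
        latencies_ms.append((time.perf_counter() - start) * 1e3)
    return latencies_ms


def summarize(latencies_ms: List[float]) -> Dict[str, float]:
    """Reduce per-sample latencies to a few summary statistics."""
    return {
        "mean_ms": statistics.mean(latencies_ms),
        "median_ms": statistics.median(latencies_ms),
        "max_ms": max(latencies_ms),
    }
```

In use, `infer` would wrap the runtime call for a given device and `samples` would be preprocessed character images; comparing the summaries across devices and models gives the kind of per-input latency comparison the abstract refers to.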

