Synergy

Guanwen Zhong, Cheng Tan, Tulika Mitra, Akshat Dubey

doi:10.1145/3301278

Abstract

Convolutional Neural Networks (CNNs) have been widely deployed in diverse application domains. There has been significant progress in accelerating both their training and inference using high-performance GPUs, FPGAs, and custom ASICs for datacenter-scale environments. The recent proliferation of mobile and Internet of Things (IoT) devices has necessitated real-time, energy-efficient deep neural network inference on embedded-class, resource-constrained platforms. In this context, we present Synergy, an automated, hardware-software co-designed, pipelined, high-throughput CNN inference framework for embedded heterogeneous system-on-chip (SoC) architectures (Xilinx Zynq). Through multi-threading, Synergy leverages all the available on-chip resources, which include the dual-core ARM processor along with the FPGA and the NEON Single-Instruction Multiple-Data (SIMD) engines as accelerators. Moreover, Synergy provides a unified abstraction of the heterogeneous accelerators (FPGA and NEON) and can adapt to different network configurations at runtime without changing the underlying hardware accelerator architecture, balancing the workload across accelerators through work-stealing. Synergy achieves a 7.3X speedup, averaged across seven CNN models, over a well-optimized software-only solution. Synergy also demonstrates substantially better throughput and energy efficiency than contemporary CNN implementations on the same SoC architecture.
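
The work-stealing idea mentioned in the abstract can be illustrated with a minimal sketch. This is not Synergy's implementation: the `Worker` class, the `run_all` function, the queue names, and the squaring "job" are all hypothetical stand-ins, and ordinary Python threads play the role of the FPGA and NEON accelerators. Each worker drains its own job queue first and, once idle, takes a job from the other worker's queue, so an uneven initial split still gets balanced at runtime.

```python
import threading
from collections import deque


class Worker:
    """Hypothetical accelerator stand-in with its own job queue."""

    def __init__(self, name):
        self.name = name
        self.jobs = deque()           # pending jobs owned by this worker
        self.lock = threading.Lock()  # guards self.jobs
        self.done = []                # results produced by this worker

    def pop_own(self):
        # The owner takes work from the tail of its own queue.
        with self.lock:
            return self.jobs.pop() if self.jobs else None

    def steal_from(self, victim):
        # A thief takes work from the head of another worker's queue.
        with victim.lock:
            return victim.jobs.popleft() if victim.jobs else None


def run_all(workers, process):
    """Run every queued job exactly once, stealing when a worker is idle."""

    def loop(me):
        while True:
            job = me.pop_own()
            if job is None:
                # Own queue is empty: try to steal from the others.
                for other in workers:
                    if other is not me:
                        job = me.steal_from(other)
                        if job is not None:
                            break
            if job is None:
                # No jobs anywhere. Jobs are only queued before start,
                # so it is safe for this worker to stop.
                return
            me.done.append(process(job))

    threads = [threading.Thread(target=loop, args=(w,)) for w in workers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


# Assumed, deliberately unbalanced split: 12 jobs on one queue, 4 on the other.
fpga = Worker("fpga")
neon = Worker("neon")
fpga.jobs.extend(range(0, 12))
neon.jobs.extend(range(12, 16))

run_all([fpga, neon], lambda tile: tile * tile)
all_results = sorted(fpga.done + neon.done)
```

How the jobs end up divided between the two workers depends on thread scheduling, but every job is processed exactly once regardless of which worker picks it up.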

Journal: ACM Transactions on Embedded Computing Systems
Publication Date: Mar 18, 2019
Citations: 33
