Performance analysis of deep learning workloads using roofline trajectories

M Haseeb Javed, Xiaoyi Lu, Khaled Z Ibrahim

doi:10.1007/s42514-019-00018-4

Abstract

Over the last decade, Deep Learning applications, technologies derived from convolutional neural networks (CNNs), have revolutionized fields as diverse as cancer detection, self-driving cars, and virtual assistants. However, many users of such applications are not experts in Machine Learning itself. Consequently, the community has limited knowledge of how to run such applications in an optimized manner. The performance question for Deep Learning applications has typically been addressed by employing bespoke hardware (e.g., GPUs) better suited for such compute-intensive operations. However, such a degree of performance is only accessible at increasingly high financial cost, leaving only big corporations and governments with sufficient resources to employ such hardware at a large scale. As a result, an average user is left with access only to commodity clusters that, in many cases, have CPUs as the sole processing element. For such users to make effective use of the resources at their disposal, concerted efforts are necessary to figure out optimal hardware and software configurations. This study is one step in this direction: we use the Roofline model to perform a systematic analysis of representative CNN models and identify opportunities for black-box and application-aware optimizations. Using the findings from our study, we are able to obtain up to a 3.5× speedup compared to vanilla TensorFlow with default configurations.
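
For readers unfamiliar with the Roofline model mentioned above, the following is a minimal sketch of its standard bound: attainable performance is the smaller of the compute peak and the product of arithmetic intensity and memory bandwidth. The function names and the machine numbers (1000 GFLOP/s peak, 100 GB/s bandwidth) are hypothetical values chosen for illustration. They are not taken from the paper and do not reproduce its roofline-trajectory analysis.

```python
def attainable_gflops(arithmetic_intensity, peak_gflops, peak_bandwidth_gbs):
    """Standard Roofline bound in GFLOP/s.

    arithmetic_intensity: FLOPs performed per byte moved to or from memory.
    peak_gflops:          compute roof of the machine (GFLOP/s).
    peak_bandwidth_gbs:   memory bandwidth roof of the machine (GB/s).
    """
    if arithmetic_intensity <= 0:
        raise ValueError("arithmetic intensity must be positive")
    # The memory roof grows linearly with intensity; the compute roof is flat.
    return min(peak_gflops, arithmetic_intensity * peak_bandwidth_gbs)


def ridge_point(peak_gflops, peak_bandwidth_gbs):
    """Arithmetic intensity (FLOPs/byte) at which the two roofs meet."""
    return peak_gflops / peak_bandwidth_gbs


def classify(arithmetic_intensity, peak_gflops, peak_bandwidth_gbs):
    """Label a kernel by which roof limits it under this model."""
    if arithmetic_intensity < ridge_point(peak_gflops, peak_bandwidth_gbs):
        return "memory-bound"
    return "compute-bound"


if __name__ == "__main__":
    # Hypothetical machine: 1000 GFLOP/s peak, 100 GB/s memory bandwidth.
    PEAK, BW = 1000.0, 100.0
    for ai in (0.5, 2.0, 10.0, 50.0):
        print(ai, attainable_gflops(ai, PEAK, BW), classify(ai, PEAK, BW))
```

Under these assumed numbers the ridge point is 10 FLOPs/byte, so a kernel below that intensity is limited by memory bandwidth and a kernel above it by the compute peak.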

Journal: CCF Transactions on High Performance Computing
Publication Date: Nov 29, 2019
Citations: 7