Profiling and optimizing deep learning inference on mobile GPUs

Shiqi Jiang, Yunxin Liu, Yusen Xu, Ting Cao, Lihao Ran

doi:10.1145/3409963.3410493

Publication Date: Aug 24, 2020

Citations: 8

Affiliation: Microsoft Research (United Kingdom)

Keywords: Mobile GPUs, Deep Learning Inference

Abstract

Mobile GPUs, the ubiquitous computing hardware found on almost every smartphone, are increasingly being used for deep learning inference. In this paper, we present our measurements of inference performance on mobile GPUs. Our observations suggest that mobile GPUs are underutilized. We study this inefficiency in depth and find that one of the root causes is improper partitioning of the compute workload. To address this, we propose a heuristics-based workload partitioning approach that considers both performance and overhead on mobile devices. Evaluation results show that our approach reduces inference latency by up to 32.8% across various devices and neural networks.
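
The abstract does not spell out the heuristics, so the following is only an illustrative sketch of what heuristics-based workload partitioning can look like, not the authors' algorithm. It assumes "partitioning" means choosing a work-group (local) size for a GPU kernel. The scoring rule, the function names, and the parameters `max_group_items` and `wave_width` are all hypothetical and would need to be replaced with values queried from a real device.

```python
from itertools import product
from math import ceil


def candidate_sizes(extent, max_size):
    """Powers of two that exceed neither the extent nor max_size."""
    sizes = []
    size = 1
    while size <= max_size and size <= extent:
        sizes.append(size)
        size *= 2
    return sizes


def padded_waste(global_size, local_size):
    """Fraction of launched work-items that fall outside the real workload."""
    launched = 1
    useful = 1
    for g, l in zip(global_size, local_size):
        launched *= ceil(g / l) * l  # each dimension is rounded up to a multiple of l
        useful *= g
    return 1.0 - useful / launched


def pick_work_group(global_size, max_group_items=256, wave_width=64):
    """Pick a local size using a simple, assumed scoring heuristic.

    Preference order (an assumption, not taken from the paper):
    1. groups that fill whole hardware waves,
    2. groups that launch the fewest padding work-items,
    3. larger groups.
    """
    best_score, best_local = None, None
    per_dim = [candidate_sizes(g, max_group_items) for g in global_size]
    for local in product(*per_dim):
        items = 1
        for l in local:
            items *= l
        if items > max_group_items:
            continue  # exceeds the assumed device limit
        lane_util = items / (ceil(items / wave_width) * wave_width)
        score = (lane_util, -padded_waste(global_size, local), items)
        if best_score is None or score > best_score:
            best_score, best_local = score, local
    return best_local


if __name__ == "__main__":
    # Hypothetical convolution output of 112 x 112 pixels with 32 channels.
    shape = (112, 112, 32)
    local = pick_work_group(shape)
    print(local, padded_waste(shape, local))
```

Because the search is a cheap static enumeration rather than an on-device trial of every configuration, it keeps selection overhead low, which is the kind of trade-off between performance and overhead the abstract alludes to. A real implementation would likely need to validate the chosen size against measured kernel latency.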
