Inference using deep neural networks on mobile devices has been an active area of research in recent years. The design of a deep learning inference framework for mobile devices must account for several constraints: the limited computational capacity of the devices, a low power budget, varied memory access methods, and the I/O bus bandwidth dictated by the underlying processor's architecture. Furthermore, integrating an inference framework with time-sensitive applications, such as games and video-based software performing tasks like ray-tracing denoising and video processing, requires minimizing data movement between processors and increasing data locality on the target processor. In this paper, we propose Shader Neural Network (ShaderNN), an OpenGL-based, fast, and power-efficient inference framework designed for mobile devices to address these challenges. Our contributions are as follows: (1) texture-based input/output provides efficient, zero-copy integration with real-time graphics pipelines and image-processing applications, avoiding the expensive CPU-GPU data transfers that most existing inference engines cannot eliminate; (2) we are the first to implement neural network inference operators with fragment shaders on the OpenGL backend, which is advantageous for deploying neural network models with small parameter counts; (3) we propose a hybrid implementation combining compute shaders and fragment shaders that enables layer-level shader selection to boost performance; and (4) we exploit OpenGL features such as normalization, interpolation, and texture padding to further improve performance. Experiments demonstrate the favorable performance of ShaderNN over popular on-device deep learning frameworks such as TensorFlow Lite on the latest mobile devices powered by Qualcomm and MediaTek chips. A case study further demonstrates the seamless integration of the ShaderNN framework into a media-processing Android application. ShaderNN is available as open source on GitHub (https://github.com/inferenceengine/shadernn).
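To make contribution (2) concrete, the following is a minimal, hypothetical sketch of what a convolution operator expressed as an OpenGL ES fragment shader can look like. The uniform names (uInput, uWeights, uBias) and the single shared 3x3 kernel are illustrative assumptions, not ShaderNN's actual API; the input feature map is assumed to live in an RGBA texture whose four channels pack four feature-map channels, and sampling with texture() under a CLAMP_TO_EDGE wrap mode supplies the border padding and interpolation mentioned above.

```glsl
#version 300 es
// Illustrative sketch only: a simplified 3x3 convolution as a fragment
// shader, with the same kernel applied to each of the four packed channels
// (a depthwise-style convolution). Names are hypothetical, not ShaderNN API.
precision highp float;

uniform sampler2D uInput;   // input feature map stored as an RGBA texture
uniform float uWeights[9];  // 3x3 kernel, row-major
uniform float uBias;
out vec4 fragColor;         // output feature map, rendered into a texture

void main() {
    // Size of one texel in normalized texture coordinates.
    vec2 texel = 1.0 / vec2(textureSize(uInput, 0));
    // gl_FragCoord.xy is the pixel center, so uv samples texel centers.
    vec2 uv = gl_FragCoord.xy * texel;
    vec4 acc = vec4(uBias);
    // Accumulate the 3x3 neighborhood; CLAMP_TO_EDGE on the sampler
    // provides replicate padding at the borders.
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            float w = uWeights[(dy + 1) * 3 + (dx + 1)];
            acc += w * texture(uInput, uv + vec2(dx, dy) * texel);
        }
    }
    fragColor = max(acc, vec4(0.0));  // fused ReLU activation
}
```

Because both the input and the output of such a pass are textures, a graphics application can feed its render targets directly to the network and consume the result in subsequent draw calls without a CPU round trip, which is the zero-copy integration claimed in contribution (1).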