Abstract
The lens distortion process is essential for displaying VR content on a head-mounted display (HMD) with a distorted display surface. This paper proposes a novel lens distortion algorithm that achieves real-time performance on edge devices with an embedded GPU. We employ a unified memory space to reduce the data transfer overhead, exploiting an architectural characteristic of such devices: an integrated CPU and GPU memory system. The lens distortion kernel is based on a lookup table-based mapping algorithm whose performance is bounded by memory operations rather than computation. To improve the kernel's performance, we propose a compressed lookup table approach that reduces the kernel's memory transactions while slightly increasing computation. We tested our method on three different edge devices and a desktop system while varying the image resolution from 720p (<inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$1,280\times 720$ </tex-math></inline-formula>) to 8K (<inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$7,680\times 4,320$ </tex-math></inline-formula>). Compared with prior GPU-based lookup table algorithms, our method achieved up to 1.72 times higher performance while consuming up to 28.93% less power. Our method also demonstrates real-time performance for up to a 4K image on a low-end edge device (e.g., 56 FPS on a Jetson Nano) and up to an 8K image on a mid-range device (e.g., 94 FPS on a Jetson NX). These results demonstrate the benefits of our approach from the perspectives of both performance and energy.
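To make the compressed-lookup-table idea concrete, the sketch below shows one plausible realization: a full LUT stores a float source coordinate per output pixel, while the compressed LUT stores only the offset from the identity mapping as a small fixed-point integer, roughly halving per-pixel LUT traffic at the cost of a decode step. The distortion model, function names, and the 16-bit fixed-point scheme here are illustrative assumptions, not the paper's actual implementation.

```python
def build_lut(w, h, k1=0.1):
    """Full LUT: for each output pixel, the (x, y) source coordinate
    under a simple radial-distortion model (one float pair per pixel).
    The single-coefficient model is a stand-in for the real lens profile."""
    lut = []
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    for y in range(h):
        for x in range(w):
            nx, ny = (x - cx) / cx, (y - cy) / cy
            r2 = nx * nx + ny * ny
            s = 1.0 + k1 * r2          # radial distortion factor
            lut.append((cx + nx * s * cx, cy + ny * s * cy))
    return lut

def compress_lut(lut, w, h, scale=64):
    """Compressed LUT: store the *offset* from the identity mapping as a
    fixed-point integer (fits in 16 bits for typical distortion profiles),
    trading a little extra arithmetic for fewer memory transactions."""
    comp, i = [], 0
    for y in range(h):
        for x in range(w):
            sx, sy = lut[i]; i += 1
            comp.append((round((sx - x) * scale), round((sy - y) * scale)))
    return comp

def decode(comp, w, x, y, scale=64):
    """Reconstruct the source coordinate from a compressed entry."""
    dx, dy = comp[y * w + x]
    return (x + dx / scale, y + dy / scale)
```

In a GPU kernel the decode would run per thread; the memory saved by halving each LUT entry is what reduces the memory-bound kernel's runtime.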