Abstract

The lens distortion process is essential for displaying VR content on a head-mounted display (HMD) with a distorted display surface. This paper proposes a novel lens distortion algorithm that achieves real-time performance on edge devices with an embedded GPU. We employ a unified memory space to reduce data transfer overhead, exploiting an architectural characteristic of these devices: an integrated CPU and GPU memory system. The lens distortion kernel is based on a lookup table-based mapping algorithm whose performance is bound by memory operations rather than computation. To improve the kernel's performance, we propose a compressed lookup table approach that reduces the kernel's memory transactions at the cost of a slight increase in computation. We tested our method on three different edge devices and a desktop system while varying the image resolution from 720p (1,280 × 720) to 8K (7,680 × 4,320). Compared with prior GPU-based lookup table algorithms, our method achieves up to 1.72 times higher performance while consuming up to 28.93% less power. Our method also achieves real-time performance for images up to 4K on a low-end edge device (e.g., 56 FPS on a Jetson Nano) and up to 8K on a mid-range device (e.g., 94 FPS on a Jetson NX). These results demonstrate the benefits of our approach in terms of both performance and energy consumption.
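
For illustration only, the following is a minimal sketch of a plain (uncompressed) lookup table-based lens distortion kernel allocated in CUDA unified memory, as one might structure it on a device with an integrated CPU and GPU memory system. All names (distortKernel, lutX, lutY), the nearest-neighbor sampling, and the 720p buffer size are assumptions for the example; this is not the paper's implementation, and the proposed compressed lookup table is not reproduced here.

// Sketch (hypothetical code, not the authors' implementation): each output
// pixel reads its precomputed source coordinates from a lookup table and
// samples the input image, so performance is dominated by memory accesses
// rather than by arithmetic.
#include <cuda_runtime.h>
#include <cstdint>

__global__ void distortKernel(const uint8_t* src, uint8_t* dst,
                              const float* lutX, const float* lutY,
                              int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    int idx = y * width + x;
    // Nearest-neighbor sampling for brevity; a real kernel would interpolate.
    int sx = static_cast<int>(lutX[idx]);
    int sy = static_cast<int>(lutY[idx]);
    dst[idx] = (sx >= 0 && sx < width && sy >= 0 && sy < height)
                   ? src[sy * width + sx]
                   : 0;
}

int main()
{
    const int width = 1280, height = 720;   // 720p example resolution
    const size_t n = static_cast<size_t>(width) * height;

    uint8_t *src, *dst;
    float *lutX, *lutY;
    // Unified memory: one allocation visible to both CPU and GPU, avoiding
    // explicit host-to-device copies on an integrated memory architecture.
    cudaMallocManaged(&src, n * sizeof(uint8_t));
    cudaMallocManaged(&dst, n * sizeof(uint8_t));
    cudaMallocManaged(&lutX, n * sizeof(float));
    cudaMallocManaged(&lutY, n * sizeof(float));

    // ... fill src with the rendered frame and lutX/lutY with the
    //     precomputed distortion mapping ...

    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x,
              (height + block.y - 1) / block.y);
    distortKernel<<<grid, block>>>(src, dst, lutX, lutY, width, height);
    cudaDeviceSynchronize();

    cudaFree(src); cudaFree(dst); cudaFree(lutX); cudaFree(lutY);
    return 0;
}

In the paper's approach, the full-resolution lutX/lutY arrays would be replaced by a compressed lookup table that is decoded in the kernel, trading a small amount of extra computation for fewer memory transactions.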
