Variable block size motion estimation implementation on compute unified device architecture (CUDA)

Dong-Kyu Lee, Seoung-Jun Oh

doi:10.1109/icce.2013.6487048

Publication Date: Jan 1, 2013

Citations: 9

Affiliation: Kwangwoon University

#Full Search Motion Estimation Algorithm #Variable Block Size Motion Estimation

Abstract

This paper proposes a highly parallel variable block size full search motion estimation algorithm with concurrent parallel reduction (CPR) on a graphics processing unit (GPU) using compute unified device architecture (CUDA). The approach minimizes memory access latency by using the GPU's high-speed on-chip memory. By applying parallel reductions concurrently, according to the amount of data and the data dependencies, it increases thread utilization and reduces the number of synchronization points, which cause latency. Experimental results show that the approach achieves a substantial improvement, up to 92 times, over the central processing unit (CPU)-only counterpart.
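The abstract does not spell out the block sizes, the matching cost, or how the reductions are arranged, so the following is only a rough sketch of the general idea, not the authors' CPR scheme. It assumes a sum of absolute differences (SAD) cost, 4x4 base blocks merged into 8x8 and 16x16 blocks (a layout common in H.264-style encoders), and a pairwise tree reduction to pick the best candidate. It runs sequentially in plain Python rather than CUDA, and every name in it (`sad_4x4`, `tree_min`, `full_search_vbs`) is hypothetical.

```python
# Illustrative sketch only: sequential Python standing in for GPU threads.
# Assumptions (not from the paper): SAD cost, 4x4 base blocks merged into
# 8x8 and 16x16, and a pairwise tree reduction to find the minimum.

def sad_4x4(cur, ref, bx, by, dx, dy):
    """SAD of the 4x4 block at (bx, by) against ref displaced by (dx, dy)."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
        for y in range(4)
        for x in range(4)
    )


def tree_min(items):
    """Pairwise reduction to the minimum item.

    Each pass halves the number of active items, which is the access
    pattern a group of GPU threads would follow in shared memory.
    """
    items = list(items)
    while len(items) > 1:
        half = (len(items) + 1) // 2
        for i in range(len(items) - half):
            items[i] = min(items[i], items[i + half])
        items = items[:half]
    return items[0]


def full_search_vbs(cur, ref, mb_x, mb_y, search_range):
    """Full search for one 16x16 macroblock at (mb_x, mb_y).

    Returns {block_size: {block_index: (best_sad, (dx, dy))}}.
    The caller must keep the whole search window inside the frame.
    """
    candidates = {4: {}, 8: {}, 16: {}}
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # 4x4 SADs are computed once per candidate displacement...
            s4 = {
                (i, j): sad_4x4(cur, ref, mb_x + 4 * i, mb_y + 4 * j, dx, dy)
                for j in range(4)
                for i in range(4)
            }
            # ...and reused by summation for the larger block sizes.
            s8 = {
                (i, j): s4[2 * i, 2 * j] + s4[2 * i + 1, 2 * j]
                + s4[2 * i, 2 * j + 1] + s4[2 * i + 1, 2 * j + 1]
                for j in range(2)
                for i in range(2)
            }
            s16 = {(0, 0): sum(s8.values())}
            for size, sads in ((4, s4), (8, s8), (16, s16)):
                for key, sad in sads.items():
                    candidates[size].setdefault(key, []).append((sad, (dx, dy)))
    # One independent reduction per block; on a GPU these could run together.
    return {
        size: {key: tree_min(cands) for key, cands in blocks.items()}
        for size, blocks in candidates.items()
    }
```

In this sketch the reductions for the 21 blocks (sixteen 4x4, four 8x8, one 16x16) are independent of one another, which is the kind of work that could be reduced concurrently on a GPU. How the paper actually groups reductions and places synchronization points is not stated in the abstract.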
