Abstract

In recent years, the K-SVD algorithm for dictionary learning has been widely used in image processing. The algorithm learns a dictionary of signal atoms from a set of training images, and sparse linear combinations of these atoms are used to approximate the original images as closely as possible. Applications of K-SVD include reconstruction, compression, denoising, sparse coding, super-resolution, and feature extraction. K-SVD is a serial machine learning algorithm in which the signal atoms are trained in succession, with every atom updated once per iteration. Because the complexity of a single iteration is O(n⁴), the training phase of K-SVD is time-intensive. To speed up K-SVD and reduce the run time of each iteration, this paper proposes a parallel version of the algorithm and verifies its validity. We design and optimize the parallel algorithm on an Nvidia Titan X GPU using three strategies: batching, early stopping, and streaming. Experimental results show that the parallel algorithm achieves a speedup of roughly 80x over a multi-threaded MATLAB implementation of K-SVD running on a quad-core CPU.
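As a rough illustration of the serial dictionary-update stage summarized above, the sketch below shows one K-SVD atom update in NumPy: the atom and its associated coefficients are refit with a rank-1 SVD of the restricted error matrix. It is a minimal sketch under assumed shapes and names (ksvd_atom_update, D, X, Y, k are illustrative, not code from the paper).

import numpy as np

def ksvd_atom_update(D, X, Y, k):
    """Update dictionary atom k and its coefficients via a rank-1 SVD.
    D : (n, K) dictionary whose columns are atoms
    X : (K, N) sparse coefficient matrix
    Y : (n, N) training signals
    Illustrative sketch of the standard K-SVD update, not the paper's code.
    """
    # Indices of training signals that actually use atom k
    omega = np.nonzero(X[k, :])[0]
    if omega.size == 0:
        return D, X  # atom unused this iteration; leave it unchanged

    # Error matrix with atom k's contribution removed, restricted to omega
    E = Y[:, omega] - D @ X[:, omega] + np.outer(D[:, k], X[k, omega])

    # Best rank-1 approximation of the restricted error matrix
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    D[:, k] = U[:, 0]              # new (unit-norm) atom
    X[k, omega] = s[0] * Vt[0, :]  # new coefficients for that atom
    return D, X

In serial K-SVD this update is applied to each of the K atoms in turn within one iteration, which is what makes the training stage a natural target for GPU parallelization.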
