Abstract

Imaging sensors with higher megapixel resolutions and frame rates are increasingly used for wide-area video surveillance (VS). This has accelerated the demand for high-performance implementations of VS algorithms that can process high-resolution video in real time. Multi-core architectures and graphics processing units (GPUs) provide an energy- and cost-efficient platform for meeting these real-time requirements by exploiting the data-level parallelism in such algorithms. However, the potential benefits of these architectures can be realized only through fine-grained parallelization strategies and algorithmic innovation. This paper describes parallel implementations of video object detection algorithms: the Gaussian mixture model (GMM) for background modelling, morphological operations for post-processing, and connected component labelling (CCL) for blob labelling. Novel parallelization strategies and fine-grained optimization techniques are described for fully exploiting the computational capacity of the CUDA cores on GPUs. Experimental results show that the parallel GPU implementation achieves speedups of ~250× for binary morphology, ~15× for GMM and ~2× for CCL over a sequential implementation running on an Intel Xeon processor, and processes HD video at 22.3 frames per second.
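To illustrate the data-level parallelism mentioned above, the following minimal sketch shows a 3×3 binary erosion in which every output pixel is computed from the input image alone. It is not the paper's CUDA implementation; the function names `erode_pixel` and `erode`, the 3×3 structuring element, and the choice to treat out-of-image pixels as background are all assumptions made for illustration. The point is only that no output pixel depends on another, which is the property that would allow a GPU to assign one thread per pixel.

```python
# Illustrative sketch only (not the authors' code): 3x3 binary erosion
# expressed as an independent per-pixel function.

def erode_pixel(img, y, x):
    """Return 1 if every pixel in the 3x3 neighbourhood of (y, x) is 1.

    Assumption: pixels outside the image count as background (0).
    """
    h, w = len(img), len(img[0])
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            ny, nx = y + dy, x + dx
            outside = not (0 <= ny < h and 0 <= nx < w)
            if outside or img[ny][nx] == 0:
                return 0
    return 1


def erode(img):
    """Apply erode_pixel to every pixel.

    Each (y, x) reads only the input image, so the iterations are
    independent; on a GPU this double loop would become the thread grid.
    """
    h, w = len(img), len(img[0])
    return [[erode_pixel(img, y, x) for x in range(w)] for y in range(h)]


# Hypothetical 5x5 foreground mask with a notch in the lower-left corner.
mask = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
eroded = erode(mask)
```

Because the per-pixel function has no side effects, the same structure would carry over to a kernel launched over a 2D grid; how far that gets towards the reported speedups depends on the memory-access optimizations the paper describes, which this sketch does not attempt.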

