Abstract
Modern imaging sensors with higher megapixel resolutions and frame rates are increasingly used for wide-area video surveillance (VS). This has accelerated demand for high-performance implementations of VS algorithms that can process high-resolution video in real time. The emergence of multi-core architectures and graphics processing units (GPUs) provides an energy- and cost-efficient platform to meet these real-time processing needs by exploiting the data-level parallelism in such algorithms. However, the potential benefits of these architectures can only be realized by developing fine-grained parallelization strategies and algorithmic innovations. This paper describes parallel implementations of video object detection algorithms: the Gaussian mixture model (GMM) for background modelling, morphological operations for post-processing, and connected component labelling (CCL) for blob labelling. Novel parallelization strategies and fine-grained optimization techniques are described for fully exploiting the computational capacity of CUDA cores on GPUs. Experimental results show that the parallel GPU implementation achieves speedups of ~250× for binary morphology, ~15× for GMM and ~2× for CCL compared with a sequential implementation running on an Intel Xeon processor, resulting in a processing rate of 22.3 frames per second for HD videos.