Abstract

Over the last decade, the particle filter (PF) has been proposed for a wide range of localization and tracking applications. Such embedded systems generally need a platform for efficient and scalable implementation of the PF. One such platform is the graphics processing unit (GPU), originally designed for fast rendering of graphics. To achieve this, GPUs are equipped with a parallel architecture which can be exploited for general-purpose computing on GPU (GPGPU) as a complement to the central processing unit (CPU). In this paper, GPGPU techniques are used to build a parallel implementation of recursive Bayesian estimation using particle filters. The modifications made to obtain a parallel particle filter, especially for the resampling step, are discussed, and the performance of the resulting GPU implementation is compared to that of a traditional CPU implementation. The comparison is made using a minimal sensor network with bearings-only sensors. The resulting GPU filter, which is the first complete GPU implementation of a PF published to date, is faster than the CPU filter when many particles are used, while maintaining the same accuracy. The parallelization uses ideas that may be applicable to other applications.

Highlights

  • The signal processing community has long relied on Moore’s law, which in short says that computer capacity doubles every 18 months.

  • Graphics processing units (GPUs) are equipped with a parallel architecture which can be exploited for general-purpose computing on GPU (GPGPU) as a complement to the central processing unit (CPU).

  • The survey in [27] details a general PF framework for localization and tracking, and it points out the importance of exploiting model structure using the Rao-Blackwellized particle filter (RBPF), also denoted the marginalized particle filter (MPF) [28, 29].

Summary

Introduction

The signal processing community has long relied on Moore’s law, which in short says that computer capacity doubles every 18 months. The community has now started to focus more on distributed and parallel implementations of the core algorithms. In this contribution, the focus is on distributed particle filter (PF) implementations.

(viii) Optional step of computing the marginal distribution of the state (the filter solution) rather than the state trajectory distribution. This is O(N^2) on a single-core processor, but parallelizable to O(N). It requires massive communication between the particles.

This suggests the following basic functions of complexity for the extreme cases of a single core, M = 1, and complete parallelization, M/N → 1: Single-core: f_1(N) = c_1 + c_2 N, Multicore.
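The quadratic cost of the optional marginal step (viii) can be illustrated with a minimal Python sketch. It is not the paper's GPU implementation. It assumes a scalar random-walk state with Gaussian process and measurement noise, and the names `gaussian_pdf`, `marginal_filter_weights`, `process_std`, and `meas_std` are hypothetical, chosen only for this example. For each new particle, the sketch evaluates the predictive mixture over all N previous particles, which gives N × N kernel evaluations in total.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def marginal_filter_weights(prev_particles, prev_weights, particles, y,
                            process_std=1.0, meas_std=1.0):
    """Normalized marginal filter density evaluated at each particle.

    Assumed model (illustration only):
        x_k = x_{k-1} + v,  v ~ N(0, process_std^2)
        y_k = x_k + e,      e ~ N(0, meas_std^2)
    """
    weights = []
    # Outer loop: one independent task per particle (the parallelizable part).
    for x in particles:
        # Inner loop: mixture over ALL previous particles, O(N) per particle,
        # so O(N^2) in total when run sequentially.
        predictive = sum(w * gaussian_pdf(x, xp, process_std)
                         for xp, w in zip(prev_particles, prev_weights))
        weights.append(gaussian_pdf(y, x, meas_std) * predictive)
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all weights underflowed to zero")
    return [w / total for w in weights]
```

The iterations of the outer loop do not depend on each other, so with one processing unit per particle each unit performs only the O(N) inner sum. Every unit still needs to read all previous particles and weights, which is consistent with the remark about massive communication between the particles.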

Parallel Programming
A GPU Particle Filter
GPU PF
Simulations
Conclusions