Abstract

Particle Filters (PFs) are Sequential Monte Carlo methods that are widely used to solve filtering problems for dynamic models under non-linear, non-Gaussian noise. Modern PF applications have demanding accuracy and run-time constraints that can be addressed through parallel computing. However, PFs can only be parallelized efficiently if the bottleneck, namely resampling and its constituent redistribution step, is itself parallelized effectively. A pre-existing implementation of redistribute on Shared Memory Architectures (SMAs) achieves $O(\frac{N}{T}\log_2 N)$ time complexity for $N$ particles over $T$ parallel cores. This redistribute implementation is, however, highly computationally intensive and cannot be effectively parallelized due to the inherently limited number of cores of SMAs. In this paper, we propose a novel parallel redistribute on OpenMP 4.5 that takes $O(\frac{N}{T} + \log_2 N)$ steps and fully exploits the computational power of SMAs. The proposed approach is up to six times faster than the $O(\frac{N}{T}\log_2 N)$ one, and its GPU implementation provides a further threefold speed-up over its equivalent on a 32-core CPU. We also show on an exemplary PF that our redistribution is no longer the bottleneck.
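
For readers unfamiliar with the redistribution step, the following is a minimal sequential sketch in Python of what it computes: given the number of copies that resampling assigns to each particle, every particle is duplicated that many times so that the population size stays constant. This is only an $O(N)$ single-core baseline for illustration, not the parallel algorithm proposed in the paper; the names `redistribute`, `particles`, and `ncopies` are assumptions introduced here.

```python
from typing import List, Sequence


def redistribute(particles: Sequence[float], ncopies: Sequence[int]) -> List[float]:
    """Sequential baseline: copy particles[i] exactly ncopies[i] times.

    Illustrative sketch only (hypothetical names); the copy counts are
    assumed to come from a resampling step and must sum to N.
    """
    if len(particles) != len(ncopies):
        raise ValueError("particles and ncopies must have the same length")
    if sum(ncopies) != len(particles):
        raise ValueError("copy counts must sum to the number of particles")

    out: List[float] = []
    for x, c in zip(particles, ncopies):
        # Particles with c == 0 are dropped; others are duplicated c times.
        out.extend([x] * c)
    return out


# Example: particle 0.2 is copied three times, 0.4 once, the rest are dropped.
particles = [0.1, 0.2, 0.3, 0.4]
ncopies = [0, 3, 0, 1]
new_particles = redistribute(particles, ncopies)
```

The loop carries a running output position from one particle to the next, which is the dependency that a parallel redistribute has to break up across the $T$ cores.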
