Improving SIMD efficiency for parallel Monte Carlo light transport on the GPU

Dietger Van Antwerpen

doi:10.1145/2018323.2018330

Abstract

Monte Carlo Light Transport algorithms such as Path Tracing (PT), Bi-Directional Path Tracing (BDPT) and Metropolis Light Transport (MLT) make use of random walks to sample light transport paths. When parallelizing these algorithms on the GPU the stochastic termination of random walks results in an uneven workload between samples, which reduces SIMD efficiency. In this paper we propose to combine stream compaction and sample regeneration to keep SIMD efficiency high during random walk construction, in spite of stochastic termination. Furthermore, for BDPT and MLT, we propose to evaluate all bidirectional connections of a sample in parallel in order to balance the workload between GPU threads and improve SIMD efficiency during sample evaluation. We present efficient parallel GPU-only implementations for PT, BDPT, and MLT in CUDA. We show that our GPU implementations outperform similar CPU implementations by an order of magnitude.

Full Text