Abstract

Context. Monte Carlo radiative transfer (MCRT) is a widely used technique to model the interaction between radiation and a medium. It plays an important role both in astrophysical modelling and in the comparison of those models with observations. Aims. We present a novel approach to MCRT that addresses the challenging memory-access patterns of traditional MCRT algorithms, which prevent optimal performance of MCRT simulations on modern hardware with a complex memory architecture. Methods. We reformulated the MCRT photon-packet life cycle as a task-based algorithm, whereby the computation is broken down into small tasks that are executed concurrently. Photon packets are stored in intermediate buffers, and tasks propagate photon packets through small parts of the computational domain, moving them from one buffer to another in the process. Results. Using the implementation of the new algorithm in the photoionization MCRT code CMACIONIZE 2.0, we show that the decomposition of the MCRT grid into small parts leads to a significant performance gain during the photon-packet propagation phase, which constitutes the bulk of an MCRT algorithm. This gain arises because memory caches are used more efficiently. Our new algorithm is faster by a factor of 2 to 4 than an equivalent traditional algorithm and shows good strong scaling up to 30 threads. We briefly discuss adjustments to our new algorithm and extensions to other astrophysical MCRT applications. Conclusions. We show that optimising the memory-access patterns of a memory-bound algorithm such as MCRT can yield significant performance gains.
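
The buffer-and-task life cycle described under Methods can be illustrated with a small toy model. The sketch below is not the CMACIONIZE 2.0 implementation: it assumes a 1D grid with uniform opacity, runs the tasks serially rather than concurrently, and every name in it (`Packet`, `run_task_based`, `tau_per_cell`, and so on) is hypothetical. It only shows how packets wait in per-subgrid buffers and how each task drains one buffer, propagating packets through one small part of the grid and handing the survivors on to a neighbouring buffer.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    cell: int        # global index of the cell the packet currently occupies
    direction: int   # +1 or -1 along the 1D grid
    tau_left: float  # optical depth remaining before the packet is absorbed


def run_task_based(n_cells=64, subgrid_size=8, n_packets=1000,
                   tau_per_cell=0.1, seed=42):
    """Toy 1D task-based MCRT loop (illustrative assumptions throughout)."""
    assert n_cells % subgrid_size == 0
    rng = random.Random(seed)
    n_sub = n_cells // subgrid_size
    absorbed = [0] * n_cells                  # absorption count per cell
    escaped = 0                               # packets that left the grid
    buffers = [[] for _ in range(n_sub)]      # one packet buffer per subgrid

    # Emit all packets from a source in the central cell.
    source = n_cells // 2
    for _ in range(n_packets):
        packet = Packet(source, rng.choice((-1, 1)), rng.expovariate(1.0))
        buffers[source // subgrid_size].append(packet)

    # A task is "propagate the buffered packets of subgrid i".
    tasks = deque(i for i in range(n_sub) if buffers[i])
    queued = set(tasks)
    while tasks:
        sub = tasks.popleft()
        queued.discard(sub)
        lo, hi = sub * subgrid_size, (sub + 1) * subgrid_size
        packets, buffers[sub] = buffers[sub], []
        for p in packets:
            # Only cells lo..hi-1 are touched here, so the working set
            # of a single task stays small.
            while lo <= p.cell < hi:
                if p.tau_left <= tau_per_cell:
                    absorbed[p.cell] += 1
                    break
                p.tau_left -= tau_per_cell
                p.cell += p.direction
            else:
                # The packet left this subgrid without being absorbed.
                if 0 <= p.cell < n_cells:
                    neighbour = p.cell // subgrid_size
                    buffers[neighbour].append(p)
                    if neighbour not in queued:
                        tasks.append(neighbour)
                        queued.add(neighbour)
                else:
                    escaped += 1
    return absorbed, escaped
```

Calling `run_task_based(subgrid_size=64)` treats the whole grid as a single subgrid, which mimics a traditional loop in this toy setting; with the same seed it yields the same absorption counts as the decomposed run, since the decomposition changes only the order in which memory is visited, not the physics. In a real code the tasks would be executed by several threads, which requires synchronised access to the buffers and the task queue.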
