Three-dimensional reverse-time migration with the constant-density acoustic wave equation requires an efficient numerical scheme for computing wavefields. An explicit finite-difference scheme in the time domain is a common choice. However, it requires a significant amount of disk space to store wavefield snapshots for the imaging condition. The frequency-domain approach simplifies the correlation of the source and receiver wavefields, but requires the solution of a large sparse linear system of equations. For the latter, we use an iterative Krylov solver based on a shifted Laplace multigrid preconditioner with matrix-dependent prolongation. The question is whether migration in the frequency domain can compete with a time-domain implementation when both are performed on a parallel architecture. Both methods are naturally parallel over shots, but the frequency-domain method is also parallel over frequencies. If we have a sufficiently large number of compute nodes, we can compute the result for each frequency in parallel, and the required time is dominated by the number of iterations for the highest frequency. As a parallel architecture, we consider a commodity hardware cluster that consists of multicore central processing units (CPUs), each connected to two graphics processing units (GPUs). Here, GPUs are used as accelerators and not as independent compute nodes. The parallel implementation of the 3D migration in the frequency domain is compared to a time-domain implementation. We optimize the throughput of the latter with dynamic load balancing, asynchronous I/O, and compression of snapshots. Because the frequency-domain solver uses matrix-dependent prolongation, the coarse-grid operators require more storage than is available on GPUs for problems of realistic size. Because of the resulting data transfer, GPU accelerators provide no significant speedup. Therefore, we consider an implementation on CPUs only. Nevertheless, with the parallelization over shots and frequencies, this approach could compete with the time-domain implementation on multiple GPUs.
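
The parallelism over shots and frequencies described above can be illustrated with a minimal sketch. It assumes a zero-lag cross-correlation imaging condition of the form Re(conj(S) R) summed over all shot and frequency pairs; the abstract does not state the exact form or weighting used. The names `partial_image`, `migrate`, `toy_source`, and `toy_receiver` are hypothetical, the toy functions stand in for the actual Helmholtz solves, and a thread pool stands in for the compute nodes of a cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def partial_image(source_field, receiver_field):
    """Assumed imaging condition at one frequency: Re(conj(S) * R) per grid point."""
    return [(s.conjugate() * r).real for s, r in zip(source_field, receiver_field)]


def migrate(shots, frequencies, solve_source, solve_receiver, max_workers=4):
    """Sum partial images over all (shot, frequency) pairs.

    Each pair is independent of the others, so the pairs can be handed to
    separate workers; only the final summation combines their results.
    """
    def task(pair):
        shot, freq = pair
        return partial_image(solve_source(shot, freq), solve_receiver(shot, freq))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(task, product(shots, frequencies)))

    image = [0.0] * len(partials[0])
    for partial in partials:
        image = [a + b for a, b in zip(image, partial)]
    return image


# Hypothetical stand-ins for the frequency-domain solves on a 3-point grid.
def toy_source(shot, freq):
    return [complex(1.0, freq)] * 3


def toy_receiver(shot, freq):
    return [complex(shot, 0.0)] * 3


image = migrate([1, 2], [1.0, 2.0], toy_source, toy_receiver)
print(image)
```

In this layout no wavefield snapshots are written to disk: each worker correlates its own pair of wavefields and returns only a partial image. With enough workers, the elapsed time is set by the slowest task, which corresponds to the highest frequency in the setting described above.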