Abstract

We evaluate the performance of a parallel 3D finite-difference method (FDM) simulation of seismic wave propagation using the Intel Xeon Phi coprocessor. Since a continued decrease in the byte/flop ratio of future machines is forecast, program optimization with a decrease byte/flop ratio was applied by fusing the original major kernel and omitting the storing and loading of intermediate variables. We confirm that 1) MPI/OpenMP hybrid parallel computing with hyper-threading is more efficient than pure MPI parallel computing and 2) the performance of the FDM simulation with a splitting of triple DO loops is 1.3 times faster than the modified code with triple DO loops, while no performance acceleration is achieved with a fused double DO-loop calculation. We consider that loop distribution optimization is effective for prefetching and the thread parallelization of each loop by its use and reuse on cache data.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call