Performance Optimization of the 3D FDM Simulation of Seismic Wave Propagation on the Intel Xeon Phi Coprocessor Using the ppOpen-APPL/FDM Library

Futoshi Mori, Takashi Furumura, Masaharu Matsumoto

doi:10.1007/978-3-319-17353-5_6

Abstract

We evaluate the performance of a parallel 3D finite-difference method (FDM) simulation of seismic wave propagation on the Intel Xeon Phi coprocessor. Because the byte/flop ratio of future machines is forecast to keep decreasing, we optimized the program to lower its required byte/flop ratio by fusing the original major kernel and omitting the storing and loading of intermediate variables. We confirm that 1) MPI/OpenMP hybrid parallel computing with hyper-threading is more efficient than pure MPI parallel computing and 2) the FDM simulation with the triple DO loops split is 1.3 times faster than the modified code with triple DO loops, while a fused double DO-loop calculation yields no speedup. We consider that loop distribution is effective because it aids prefetching and the thread parallelization of each loop through the use and reuse of cached data.
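The two transformations named in the abstract, kernel fusion and loop distribution, can be illustrated with a minimal sketch. It is written in C rather than the Fortran DO loops the abstract refers to, and it is not the ppOpen-APPL/FDM code: the grid size, the arrays `p`, `vx`, `vy`, `dpx`, `dpy`, the coefficient `c`, and all function names are hypothetical, and the first-order difference stands in for whatever stencil the actual kernel uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical grid size and flattened 3D index; not taken from the paper. */
enum { NX = 8, NY = 8, NZ = 8 };
#define IDX(i, j, k) (((size_t)(k) * NY + (size_t)(j)) * NX + (size_t)(i))

/* Unfused: the first kernel stores differences in dpx/dpy,
   the second kernel loads them back to update vx/vy. */
void update_unfused(double *vx, double *vy, const double *p,
                    double *dpx, double *dpy, double c)
{
    assert(vx && vy && p && dpx && dpy);
    for (int k = 0; k < NZ; k++)
        for (int j = 0; j < NY - 1; j++)
            for (int i = 0; i < NX - 1; i++) {
                dpx[IDX(i, j, k)] = p[IDX(i + 1, j, k)] - p[IDX(i, j, k)];
                dpy[IDX(i, j, k)] = p[IDX(i, j + 1, k)] - p[IDX(i, j, k)];
            }
    for (int k = 0; k < NZ; k++)
        for (int j = 0; j < NY - 1; j++)
            for (int i = 0; i < NX - 1; i++) {
                vx[IDX(i, j, k)] += c * dpx[IDX(i, j, k)];
                vy[IDX(i, j, k)] += c * dpy[IDX(i, j, k)];
            }
}

/* Fused: one triple loop; the differences live in local scalars,
   so the intermediate arrays are never stored or loaded. */
void update_fused(double *vx, double *vy, const double *p, double c)
{
    assert(vx && vy && p);
    for (int k = 0; k < NZ; k++)
        for (int j = 0; j < NY - 1; j++)
            for (int i = 0; i < NX - 1; i++) {
                const double dx = p[IDX(i + 1, j, k)] - p[IDX(i, j, k)];
                const double dy = p[IDX(i, j + 1, k)] - p[IDX(i, j, k)];
                vx[IDX(i, j, k)] += c * dx;
                vy[IDX(i, j, k)] += c * dy;
            }
}

/* Fused and distributed: the fused body is split into one triple loop
   per output array, so each loop touches fewer data streams. */
void update_distributed(double *vx, double *vy, const double *p, double c)
{
    assert(vx && vy && p);
    for (int k = 0; k < NZ; k++)
        for (int j = 0; j < NY - 1; j++)
            for (int i = 0; i < NX - 1; i++)
                vx[IDX(i, j, k)] += c * (p[IDX(i + 1, j, k)] - p[IDX(i, j, k)]);
    for (int k = 0; k < NZ; k++)
        for (int j = 0; j < NY - 1; j++)
            for (int i = 0; i < NX - 1; i++)
                vy[IDX(i, j, k)] += c * (p[IDX(i, j + 1, k)] - p[IDX(i, j, k)]);
}
```

All three functions compute the same update; they differ only in memory traffic and loop structure. In the distributed form each triple loop could be threaded independently, for example with an OpenMP parallel-for directive. Whether that form is faster depends on the hardware, and this sketch makes no claim about reproducing the 1.3 times figure reported above.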

Full Text