Achieving 60 GFLOP/s on the production CFD code OVERFLOW-MLP

James R Taft

doi:10.1016/s0167-8191(00)00072-7

Abstract

NASA Ames has developed a methodology for achieving very high levels of parallel efficiency on today's NUMA-based shared-memory symmetric multiprocessing (SMP) systems. The methodology is simple, general, and widely applicable to production CFD codes in use at NASA and elsewhere. It is formally called shared-memory Multi-Level Parallelism (MLP) and is based on shared-memory access to global data combined with two levels of parallelism for scaling efficiency. During the past year, the technique has been refined and inserted into the OVERFLOW production CFD code. Runs of the new code on a 512-CPU SGI Origin R12K system have demonstrated over 60 GFLOP/s of sustained performance on real, customer-driven problems. A detailed discussion of the MLP technique, the OVERFLOW-MLP code optimizations, and the performance results is presented.
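To put the headline figure in per-processor terms, the short sketch below divides the sustained rate by the CPU count. It assumes that all 512 CPUs took part in the 60 GFLOP/s run, which the abstract does not state explicitly, and it treats 60 GFLOP/s as a lower bound because the abstract reports "over 60". The function name `per_cpu_mflops` is hypothetical and is not part of OVERFLOW-MLP.

```python
def per_cpu_mflops(sustained_gflops: float, cpus: int) -> float:
    """Return the average sustained rate per CPU in MFLOP/s.

    Assumption (not stated in the abstract): every CPU in the system
    contributed to the measured sustained rate.
    """
    if cpus <= 0:
        raise ValueError("cpus must be positive")
    # 1 GFLOP/s = 1000 MFLOP/s
    return sustained_gflops * 1000.0 / cpus


# Figures taken from the abstract: over 60 GFLOP/s on a 512-CPU system.
rate = per_cpu_mflops(60.0, 512)
print(f"{rate:.1f} MFLOP/s per CPU")  # prints "117.2 MFLOP/s per CPU"
```

Under that assumption, the result is an average of at least roughly 117 MFLOP/s sustained per CPU.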

Full Text