Abstract

The performance of supercomputers has traditionally been evaluated using the LINPACK benchmark [3], which stresses only the floating-point units without significantly loading the memory or network subsystems. The HPC Challenge (HPCC) benchmark suite has been proposed as an alternative for evaluating supercomputer performance. It consists of seven benchmarks, each designed to measure a specific aspect of system performance: (i) High Performance LINPACK (HPL); (ii) DGEMM, which measures the floating-point rate of execution of double-precision real matrix-matrix multiplication; (iii) STREAM, which measures sustainable memory bandwidth and the corresponding computation rate for four simple vector kernels, namely copy, scale, add, and triad; (iv) PTRANS, which exercises the network by taking the parallel transpose of a large distributed matrix; (v) RandomAccess, which measures the rate of integer updates to random memory locations; (vi) FFT, which measures the floating-point rate of execution of a double-precision complex one-dimensional discrete Fourier transform (DFT); and (vii) communication bandwidth and latency, which measures the latency and bandwidth of a number of simultaneous communication patterns. In this paper we outline the optimization techniques used to obtain the best performance reported to date for the HPCC RandomAccess benchmark on the Blue Gene/L supercomputer.
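To make the RandomAccess workload concrete, the following is a minimal single-process sketch in Python of the kind of update loop the benchmark times: a pseudo-random 64-bit stream selects table entries, and each selected entry is XORed with the stream value. It is an illustration only, not the optimized Blue Gene/L implementation discussed in the paper. The function names, the seed of 1, the small table size, and the 4-updates-per-entry ratio used in `run` are assumptions chosen for readability.

```python
MASK64 = (1 << 64) - 1   # keep values within 64 bits
POLY = 0x7               # feedback polynomial for the shift-register stream (assumed here)


def next_random(a):
    """Advance the 64-bit shift-register sequence by one step."""
    high_bit_set = a >> 63           # 1 if the top bit is set before shifting
    a = (a << 1) & MASK64            # shift left, discarding overflow
    return a ^ POLY if high_bit_set else a


def random_access(table, num_updates, seed=1):
    """Apply num_updates XOR updates to pseudo-random table locations."""
    size = len(table)
    if size == 0 or size & (size - 1) != 0:
        raise ValueError("table size must be a power of two")
    a = seed
    for _ in range(num_updates):
        a = next_random(a)
        table[a & (size - 1)] ^= a   # low bits pick the entry; XOR is the update
    return table


def run(log2_size=10):
    """Illustrative driver: table[i] = i, then 4 updates per table entry."""
    size = 1 << log2_size
    table = list(range(size))
    random_access(table, 4 * size)
    return table
```

Because XOR is its own inverse, applying the same update stream a second time restores the table to its initial contents, which gives a simple correctness check for a sketch like this. The benchmark's figure of merit is the update rate, that is, the number of updates divided by the elapsed time.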
