Abstract

Even expert HPC programmers must invest considerable time and effort to establish, empirically, effective performance tuning strategies for their target systems. When the target system is replaced or updated, it is therefore desirable that as much of this performance tuning expertise as possible can be ported to the new system. In this paper, we focus on multiple generations of NEC SX series vector systems. We have documented the performance tuning expertise accumulated for previous generations and built a machine-usable database of performance tuning cases. This paper investigates how much the expertise recorded in the database can contribute to performance tuning for the latest generation, NEC SX-Aurora TSUBASA (SX-AT). Because both the system architecture and the software stack, including the compilers, have been completely overhauled for SX-AT, this paper discusses the differences in performance tuning across system generations. It also discusses how to express performance tuning techniques in a machine-usable way. The case study indicates that Xevolver's approach of using user-defined code transformations can express most of the vectorization-aware performance tuning techniques and is thus promising for recording performance tuning expertise in a future-proof fashion.

