Abstract

High performance computing (HPC) demands huge memory bandwidth and computing resources to achieve maximum performance and energy efficiency. FPGAs can provide both, and with the help of High Level Synthesis, those HPC applications can be easily written in high level languages. However, the optimization process remains time-consuming, especially when based on trial-and-error bitstream generation. Model-based performance prediction is a practical and fast approach for kernel optimization, specially if done with information from pre-synthesis reports. This article presents an analytical model focused on memory intensive applications that captures the memory behavior and accurately predicts the kernel execution time within seconds rather than hours, as bitstream generation requires. The model has been validated with two DRAM technologies: DDR4 and HBM2, with a set of microbenchmarks and high performance computing applications showing an average error of 11% for DDR4 and 10% for HBM2. Compared with previous studies, our predictions at least halve the estimation error.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call