Abstract
FPGAs have great potential as accelerators for high-performance computing applications, but their performance depends on many factors. Designing for FPGAs and selecting the proper optimizations when mapping computations onto them leads to prohibitively long development times. Two alternatives exist: high-level synthesis (HLS) tools, which promise fast design space exploration by raising the level of design abstraction, and analytical performance models, which provide realistic performance expectations, identify potential impediments to performance, and offer optimization guidelines. In this paper we propose combining both to construct a performance model for FPGAs that visually condenses all the information helpful to the designer. Our proposed model extends the roofline model by considering the resource consumption and the parameters used in the HLS tools, in order to maximize performance and resource utilization within the area of the FPGA. The proposed model is applied to optimize the design exploration of a class of window-based image processing applications using two different HLS tools. The results show the accuracy of the model as well as its flexibility to be combined with any HLS tool.
Highlights
Field programmable gate arrays (FPGAs) are a programmable and massively parallel architecture offering great performance potential for compute-intensive applications
FPGA performance models have already been proposed in the past ([10] and especially [11]), but the high-level synthesis (HLS) tools were not mature enough at that time to be included in the model
Analysing the main characteristics of the original roofline model shows that it lacks a connection between computational power and resource consumption, which is one of the most important parameters on FPGAs
Summary
Field programmable gate arrays (FPGAs) are a programmable and massively parallel architecture offering great performance potential for compute-intensive applications. A performance analysis is required to estimate the level of performance achievable for a particular application, even before starting the implementation. Performance models identify potential bottlenecks, the most appropriate optimizations, and the maximum attainable performance. Most current HLS tools provide detailed information about the performance of each implementation and an estimate of its resource consumption. In turn, knowing the resource consumption of each design and the available resources of the target FPGA allows us to estimate the replication level, that is, how many copies of the design fit on the device. The extended model provides implementation guidelines on the impact of the optimizations, together with performance estimates that take the available resources into account.
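The link between resource consumption, replication level, and the roofline bound can be illustrated with a minimal sketch. It assumes the classic roofline formulation (attainable performance is the minimum of the compute ceiling and the product of memory bandwidth and computational intensity) and assumes that the compute ceiling scales linearly with the number of replicated processing elements. The function names, resource categories, and all numbers are hypothetical and are not taken from the paper, whose exact formulation is not given here.

```python
def replication_level(available, per_unit):
    """Number of copies of one design that fit on the device.

    The scarcest resource decides: for each resource type, divide what the
    FPGA offers by what one copy consumes, then take the minimum.
    """
    return min(available[r] // per_unit[r] for r in per_unit if per_unit[r] > 0)


def attainable_performance(intensity, bandwidth, unit_throughput, replicas):
    """Roofline bound with a replication-scaled compute ceiling.

    intensity       -- computational intensity (operations per byte)
    bandwidth       -- memory bandwidth (GB/s)
    unit_throughput -- throughput of one copy of the design (GOPS)
    replicas        -- replication level from replication_level()
    """
    compute_ceiling = unit_throughput * replicas  # assumption: linear scaling
    memory_ceiling = bandwidth * intensity
    return min(compute_ceiling, memory_ceiling)


# Hypothetical device and HLS resource estimate for one processing element.
available = {"lut": 200_000, "dsp": 1_000, "bram": 500}
per_unit = {"lut": 15_000, "dsp": 120, "bram": 40}

replicas = replication_level(available, per_unit)  # DSPs are the limit here
low = attainable_performance(1.5, 2.0, 0.5, replicas)   # memory-bound case
high = attainable_performance(4.0, 2.0, 0.5, replicas)  # compute-bound case
print(replicas, low, high)
```

In this sketch, a design with low computational intensity stays under the memory ceiling no matter how many copies fit, while a design with high intensity is capped by the resources of the device. This is the kind of guidance the extended model is meant to condense for the designer.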