Automating FPGA-based CSoC platform generation

Wijesundera, Deshya Senelie Siriwardena

doi:10.32657/10220/48432

Abstract

Field Programmable Gate Array (FPGA) based configurable system-on-chip (CSoC) platforms have become a preferred choice for embedded computing systems that must meet the increasing demand for shorter Time-to-Market (TTM) and lower Non-Recurring Engineering (NRE) costs, owing to both their high density and their myriad on-chip hardware and software compute resources. However, the inability of existing tools to exploit these resources effectively to satisfy design constraints, especially from high-level specifications such as C/C++, remains a bottleneck in meeting TTM pressures. In this research, techniques for the automatic generation of an FPGA-based CSoC platform have been proposed to satisfy area-time design constraints while taking user preferences into account. A rapid technique has been proposed to estimate application performance (runtime) on soft core processors. The proposed methodology relies on the target-independent intermediate representation (IR) of the LLVM compiler and does not require executing the application on the target processor or on an instruction set simulator, which makes it applicable to other soft core processors and their corresponding FPGA architectures. Further, the approach scales to the large number of configuration options available in modern soft core processors. The technique takes into account both data hazards and control hazards within the processor pipeline in order to obtain high estimation accuracy. Experimental results using applications from the CHStone benchmark suite on two commercial soft core processors, Xilinx MicroBlaze and Altera Nios, show an average error of only 5% across the full design space. Noting that modern FPGA platforms typically also integrate hard core processors, the technique proposed for soft core processors has been extended to support performance estimation for hard core processors.
This necessitated the introduction of models for performance-centric features such as dual-issue, out-of-order and superscalar execution. The extended technique takes into account data hazards, control hazards and structural hazards within the processor pipeline in order to obtain high estimation accuracy. It has been tested using applications from the CHStone benchmark suite on the ARM Cortex-A9 processor in a Xilinx Zynq SoC FPGA and has been shown to be accurate, with an average estimation error of only 5.84%. This accuracy compares well with that of the soft core processor performance estimation. Moreover, a unified framework to facilitate performance estimation on both soft core and hard core processors has been proposed and described. A novel technique for hardware area-time estimation of applications on FPGAs has also been proposed. The application C code was first converted to the target-independent LLVM IR, and the basic blocks were then wrapped as functions using an LLVM transformation pass. The LegUp tool’s ‘LLVM IR functions to RTL modules’ conversion was then carried out to facilitate RTL synthesis using the Altera Quartus tools. In order to support FPGAs…
