Abstract

Heterogeneous computing is a growing trend in computer architecture design. It is often used to improve the performance and power efficiency of applications by utilizing special-purpose processors or accelerators, such as the Graphics Processing Unit (GPU), Field Programmable Gate Array (FPGA), and Digital Signal Processor (DSP). As complexity increases, the interaction among accelerators and processors can become a pitfall if a race condition occurs. However, existing tools for detecting such conditions are either too slow or make the race condition detection mechanism hard to extend. Tools that profile applications with an approximate timing model are therefore important for designing such heterogeneous systems in a timely manner. In this paper, we propose a pluggable GPU interface on an existing timing-approximate CPU simulator based on QEMU for analyzing the memory behavior of heterogeneous systems. Because it monitors memory behavior, the pluggable interface can be extended to any kind of accelerator, such as a GPU, DSP, or FPGA, for race condition detection. Taking the GPU as an example, we integrated the detailed GPU simulator from Multi2Sim with the existing timing-approximate CPU simulator, VPA, to showcase the efficiency of the proposed work. The experimental results show that the emulation speed of the proposed framework is up to 9x faster than Multi2Sim with acceptable timing accuracy, at an error rate of less than 20% compared with our previous work. In addition, the race condition detection mechanism reports the problematic memory accesses to the user.
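To make the idea of detecting races from monitored memory behavior concrete, the following is a minimal sketch and not the mechanism implemented in the paper. It assumes a hypothetical trace format in which every access records the issuing device, the address, whether it is a write, and a synchronization epoch that increases at each synchronization point. Under that assumption, two accesses from different devices to the same address in the same epoch, at least one of them a write, are flagged as a potential race. The names `Access` and `find_races` are illustrative only.

```python
from collections import defaultdict
from typing import NamedTuple


class Access(NamedTuple):
    device: str     # hypothetical label, e.g. "cpu" or "gpu"
    address: int    # memory address touched
    is_write: bool  # True for a store, False for a load
    epoch: int      # assumed to increase at every synchronization point


def find_races(trace):
    """Return pairs of accesses that conflict within one epoch.

    Assumption: accesses in different epochs are ordered by synchronization,
    so only accesses sharing both address and epoch can race.
    """
    by_location = defaultdict(list)
    for acc in trace:
        by_location[(acc.address, acc.epoch)].append(acc)

    races = []
    for accesses in by_location.values():
        for i, a in enumerate(accesses):
            for b in accesses[i + 1:]:
                # A conflict needs two different devices and at least one write.
                if a.device != b.device and (a.is_write or b.is_write):
                    races.append((a, b))
    return races


trace = [
    Access("cpu", 0x1000, True, 0),
    Access("gpu", 0x1000, False, 0),  # same epoch as the CPU write: flagged
    Access("gpu", 0x2000, True, 0),
    Access("cpu", 0x2000, False, 1),  # later epoch: treated as ordered
]
races = find_races(trace)
for a, b in races:
    print(f"potential race at {a.address:#x}: {a.device} vs {b.device}")
```

A detector attached to a timing simulator would presumably use access timestamps and the actual synchronization events rather than precomputed epochs; the epoch field here only stands in for that ordering information.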
