Abstract

Codes with large-stride or irregular-stride (L/I) memory access patterns, e.g., sparse matrix and linked list codes, often perform poorly on mainstream clusters because of the general purpose processor (GPP) memory hierarchy. High performance reconfigurable computers (HPRCs) contain both GPPs and field programmable gate arrays (FPGAs) connected via a high-speed network. In this research, simple 64-bit floating-point codes are used to illustrate the runtime performance impact of L/I memory accesses in both software-only and FPGA-augmented codes and to assess the benefits of mapping L/I-type codes onto HPRCs. The experiments documented herein reveal that large-stride software-only codes experience severe performance degradation, whereas large-stride FPGA-augmented codes experience minimal performance degradation. For experiments with large data sizes, the unit-stride FPGA-augmented code ran about two times slower than the software-only code. In contrast, the large-stride FPGA-augmented code ran faster than the software-only code for all the larger data sizes, and the largest data size showed a 17-fold runtime speedup.
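
To make the distinction between unit-stride and large-stride access concrete, the following minimal sketch sums an array of 64-bit floating-point values while visiting the elements in stride order. It is an illustration only and is not the benchmark code used in this research; the names `strided_indices` and `strided_sum`, the array length, and the stride value are all assumptions chosen for the example. Both calls perform the same arithmetic on the same data, and only the order of memory accesses differs.

```python
from array import array


def strided_indices(n, stride):
    """Yield every index in range(n) exactly once, in stride order.

    With stride 1 the indices are consecutive (unit stride). With a larger
    stride, consecutive accesses are `stride` elements apart.
    """
    for offset in range(min(stride, n)):
        for i in range(offset, n, stride):
            yield i


def strided_sum(data, stride):
    """Sum all elements of `data`, visiting them with the given stride."""
    total = 0.0
    for i in strided_indices(len(data), stride):
        total += data[i]
    return total


# Hypothetical input: sixteen 64-bit floats (type code "d" is a C double).
data = array("d", (float(i) for i in range(16)))

unit_stride_total = strided_sum(data, 1)   # visits 0, 1, 2, 3, ...
large_stride_total = strided_sum(data, 4)  # visits 0, 4, 8, 12, 1, 5, ...
```

In a compiled language and with arrays much larger than the processor caches, the large-stride ordering is the one expected to suffer on a GPP memory hierarchy. Interpreted Python adds overhead that masks much of this effect, so the sketch is meant to show the access pattern rather than to reproduce the reported timings.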
