Abstract
Many-Accelerator (MA) systems have been introduced as a promising architectural paradigm that can boost the performance and improve the power efficiency of general-purpose computing platforms. In this paper, we focus on the problem of resource under-utilization, i.e., Dark Silicon, in FPGA-based MA platforms. We show that, in addition to the typically expected peak power budget, on-chip memory resources form a severe under-utilization factor in MA platforms, leading to up to 75 percent dark silicon. Recognizing that static memory allocation (the de facto mechanism supported by modern design techniques and synthesis tools) forms the main source of memory-induced Dark Silicon, we introduce a novel framework that extends conventional high-level synthesis (HLS) with dynamic memory management (DMM) features. These features enable accelerators to dynamically adapt their allocated memory to their runtime memory requirements, thus maximizing the overall accelerator count through effective sharing of the FPGA's memory resources. We show that our technique delivers significant gains in FPGA accelerator density, i.e., 3.8$\times$, and in application throughput, up to 3.1$\times$ and 21.4$\times$ for shared and private memory accelerators, respectively.